CAS numbers uniquely identify every chemical man can find or dream up. There are a staggering 60,000,000+ in the CAS registry, and 12,000 more are added daily. Each number follows a specific format mask and ends in a check digit. For a project I needed a simple client-side CAS# validation routine that checks that the format is legit and the check digit is correct, as opposed to looking up the CAS# in a 60-million-row database.

Industry-specific shared code is difficult to find; most of the time it is trapped inside corporate walls. A while ago I searched for a routine to validate the format and checksum of CAS numbers, and although there is one out there now, at the time anyone who needed one had to write it from scratch.

I intend to put up more long-tail code snippets over time, but to be honest, at the moment I’m motivated more by wanting sample code to play with the Google Code Prettifier from my last post.

First, in JavaScript for client-side validation:

// For a properly formed CAS# string, return true or false  
//  depending on if the check digit passes the test.
//  For improperly formed CAS#s, return NaN.
function checkCas(cas) {
    // Check the string against the mask
    // 2-7 digits, dash, 2 digits, dash, check digit
    if (! (/^\d{2,7}-\d{2}-\d$/.test(cas))) {
        return NaN;
    } else {
        // Remove the dashes
        cas = cas.replace(/\-/g,"");
        // Although the definition is usually expressed as a 
        // right-to-left fn, this works left-to-right.
        // Note the loop stops one character shy of the end.
        var sum = 0;
        for (var indx=0; indx < cas.length-1; indx++) {
            // Always pass the radix so the digit is parsed as base 10.
            sum+=(cas.length-indx-1)*parseInt(cas.charAt(indx),10);
        }
        // Check digit is the last char, compare to sum mod 10.
        return parseInt(cas.charAt(cas.length-1),10) === (sum % 10);
    }
}
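
The check digit rule is usually stated right-to-left: ignoring the check digit itself, the rightmost digit is multiplied by 1, the next by 2, and so on, and the sum mod 10 must equal the check digit. Here is a minimal sketch of that definition, mostly as a way to work a few well-known CAS#s by hand. The helper name casCheckDigit is my own invention for illustration and is not part of the routine above; it also assumes its input is already well formed.

```javascript
// Hypothetical helper (not part of the routine above): computes the expected
// check digit for the digits that precede it, using the right-to-left rule.
// Assumes body is well formed, e.g. "7732-18" for the CAS# 7732-18-5.
function casCheckDigit(body) {
    var digits = body.replace(/-/g, "");
    var sum = 0;
    // The rightmost digit gets weight 1, the next one weight 2, and so on.
    for (var weight = 1; weight <= digits.length; weight++) {
        sum += weight * parseInt(digits.charAt(digits.length - weight), 10);
    }
    return sum % 10;
}

// Water, 7732-18-5: 8*1 + 1*2 + 2*3 + 3*4 + 7*5 + 7*6 = 105, and 105 % 10 = 5
console.log(casCheckDigit("7732-18")); // 5
// Ethanol, 64-17-5: 7*1 + 1*2 + 4*3 + 6*4 = 45, and 45 % 10 = 5
console.log(casCheckDigit("64-17"));   // 5
// Formaldehyde, 50-00-0: 0*1 + 0*2 + 0*3 + 5*4 = 20, and 20 % 10 = 0
console.log(casCheckDigit("50-00"));   // 0
```

One thing to watch when calling checkCas itself: NaN is falsy, so a plain if (!checkCas(x)) lumps malformed strings together with bad check digits. If you need to tell the two apart, test the result with isNaN first.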

I also converted it to Java for some server-side processing:

/**
 * For a properly formed CAS# string, return true or false
 * depending on if the check digit passes the test.
 * For improperly formed CAS#s throw a ParseException.
 *
 * @param cas The string of the CAS#
 * @return true if the check digit matches.
 * @throws ParseException if the string does not fit the CAS# mask.
 */
public static boolean checkCas(String cas) throws ParseException {
    // Check the string against the mask
    // 2-7 digits, dash, 2 digits, dash, check digit
    if (! cas.matches("^\\d{2,7}-\\d{2}-\\d$")) {
        throw new ParseException(cas,0);
    } else {
        // Remove the dashes
        cas = cas.replaceAll("-","");
        // Although the definition is usually expressed as a 
        // right-to-left fn, this works left-to-right.
        // Note the loop stops one character shy of the end.
        int sum = 0;
        for (int indx=0; indx < cas.length()-1; indx++) {
            sum += (cas.length()-indx-1)*Integer.parseInt(cas.substring(indx,indx+1));
        }
        // Check digit is the last char, compare to sum mod 10.
        return Integer.parseInt(cas.substring(cas.length()-1)) == (sum % 10);
    }
}

Also, for the Java version don’t forget:

import java.text.ParseException;

So that’s about it, just 10 lines of code x2 for today’s post.

If you use it or improve upon it, please let me know. I don’t believe you should be required to, and under the chosen license you aren’t, but it’s still a nice thing to do. Also, if you need a waiver from attribution, let me know.