Barcode Symbologies [Design]

From: Walter Banks on 25 May 2010 12:04

D Yuniskis wrote:

> Hi Walter,
>
> Walter Banks wrote:
> >
> >
> > It is actually unlikely that a foreign label will conflict
> > with your identifier space. UPC is the classic case:
> > users register the part of *their* barcode that makes their
> > case unique.
> >
> > http://www.abblabels.com/Products/html/UPCBarcodeLabels.htm
>
> Yes, but *I* can opt to use UPC encoding to represent my
> data. Doing so suggests that, sooner or later, I will
> encounter an item with a "real" UPC barcode printed on it
> that my "system" will mistakenly take as "one of my own".
> I.e., a box of Cheerios coincidentally has the same
> "identifier" (encoded into the body of the UPC label)
> as my "Capacitor, Electrolytic, 25uF, 16WVDC, radial".
> (the *scanner* can't tell that it's a box of cheerios!)

You can also opt to use UPC register and never mistake a
cap for a box of cheerios.

> > Think of the UPC registration, for example, as a preamble
> > followed by the actual information. The registration
> > has the added advantage of making the owner
> > information available to most people using UPC
> > reader equipment. Registration makes your UPC record unique.
> >
> > This process makes false positives very rare.
>
> Only rare for folks who are part of that "registered identifier
> space".

In the UPC case, essentially all users are registered. The
registration cost is very low to encourage participation; the
benefits are data security and rare false positives. The standard
UPC record checking will weed out essentially all of the
remaining unregistered UPC images.
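
One concrete piece of the record checking mentioned above is the UPC-A check digit. The following is a minimal Python sketch of the standard mod-10 calculation. The function names are made up for this illustration, and the check says nothing about registration: it only confirms that the twelve digits are self-consistent.

```python
import re

def upc_a_check_digit(first11: str) -> int:
    """Return the UPC-A check digit for the first 11 digits of a code."""
    if not re.fullmatch(r"[0-9]{11}", first11):
        raise ValueError("expected exactly 11 digits")
    digits = [int(c) for c in first11]
    # Digits in odd positions (1st, 3rd, ...) are weighted 3, the rest 1.
    total = 3 * sum(digits[0::2]) + sum(digits[1::2])
    return (10 - total % 10) % 10

def is_valid_upc_a(code: str) -> bool:
    """True if code is 12 digits and its last digit is the correct check digit."""
    if not re.fullmatch(r"[0-9]{12}", code):
        return False
    return upc_a_check_digit(code[:11]) == int(code[11])

if __name__ == "__main__":
    print(is_valid_upc_a("036000291452"))
```

A scanner normally performs this check itself for UPC, so application code would usually only repeat it as a sanity check on data arriving from other sources.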

UPC was developed to eliminate the problem that you
have been describing. A bunch of years ago the barcode
industry was filled with many different transport layers
and data formats most with a unique application area.
The only security was each company creating their
own encoding and data and record formats.

> Sort of like OUI's -- I can program *any* MAC into
> my NIC. But, there is no guarantee that I won't end up
> programming a MAC that isn't already in use, somewhere
> (belonging to some other organization).
>
> This is the problem I am trying to anticipate and avoid.
>
> > Code39 does not have the same registration process. In
> > most of the applications using Code39 record format
> > and check characters are sufficient to prevent undetected
> > errors.
> >
> > For example, a Code 39 barcode record for
> > your application might be
> >
> > *DJ1234567855*
> >
> > This would be a Code 39 label about 1.5" long with
> > DJ to identify it as yours
> > 8 data digits
> > 55 two check digits.
> >
> > All records you are interested in would start with
> > DJ. Since the text is usually printed on the label as well,
> > there would be limited scope for false positives.
>
> Yes, this was my point about adding extra digits (I'm
> avoiding other characters simply because those other
> characters restrict you to using a smaller set of
> symbologies). But, you can't *know* (for sure) what
> "message/label formats" other people may use. I.e.,
> tomorrow, you may start doing business with a company who
> labels *all* of their products with "DJ" numbers. :-/
> (the longer the 'preface' or other "unique-ifying" data,
> the less the chance of random conflict ... "YUNISKIS"
> might be a suitably unique choice for a prefix! :> )

False positives are rare once format, validation and context
are accounted for. For example, the automotive industry
has literally hundreds of suppliers using two or three
standard barcode encodings and doesn't have a problem
with false positives because of this.
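
The "DJ" record layout quoted above (a two-letter owner prefix, 8 data digits, 2 check digits) can be validated in a few lines. This is only a sketch: the thread never says how the two check digits are computed, so the modulo-97 scheme below is a stand-in chosen for illustration, and the names `make_record` and `parse_record` are hypothetical. It also assumes the scanner has already stripped the Code 39 start/stop asterisks.

```python
import re
from typing import Optional

# Layout from the thread: "DJ" + 8 data digits + 2 check digits.
RECORD = re.compile(r"DJ([0-9]{8})([0-9]{2})")

def check_digits(data: str) -> str:
    """Hypothetical check scheme: data value modulo 97, as two digits."""
    return f"{int(data) % 97:02d}"

def make_record(data: str) -> str:
    """Build a record string from 8 data digits."""
    if not re.fullmatch(r"[0-9]{8}", data):
        raise ValueError("expected exactly 8 digits")
    return "DJ" + data + check_digits(data)

def parse_record(text: str) -> Optional[str]:
    """Return the 8 data digits if text is a well-formed record, else None."""
    match = RECORD.fullmatch(text)
    if match is None:
        return None  # wrong prefix, wrong length, or non-digit data
    data, check = match.groups()
    return data if check == check_digits(data) else None

if __name__ == "__main__":
    record = make_record("12345678")
    print(record, parse_record(record))
```

Under this stand-in scheme the literal example from the thread would not validate, since its "55" was presumably produced by some other rule.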

> For example, Dell uses 6 character identifiers for their
> "service tags". What are the chances that using a similar
> format would result in eventually coming up with a "hit"
> that was unintended?

Actually, Dell now uses a 7 character identifier for service
tags. However, you just identified an important point
about identifying barcodes: context. Dell uses several barcodes
on the bottom of their laptops to identify product,
manufacturer, software licences and service tags.

> I wonder how many Code 128 labels
> have "SN" (Serial Number) as their first two characters?

Probably quite a few and a lot more Code39. Fewer
using the same field format for the actual serial number
fields which encode manufacture data, product id and
actual identifying number.

> I am, instead, looking to pick a symbology that is less
> likely to be encountered. *Or*, bend the rules regarding
> *my* choices of symbols such that "everyone else's"
> look invalid to me. I.e., you're advocating making them
> look invalid by relying on "DJ" to make them unique;
> I'm thinking, instead, of violating some inherent
> aspect of the label format -- check digit -- that they
> *wouldn't*... because they would want the scanner to
> do the check digit verification *for* them whereas I
> am willing to take on that task myself -- using a different
> "scheme".

Barcode formats are implemented in layers. Additional
checking in some cases is important. Do it. I am not
sure what your application is. Describing the application
may identify or reject some of the standard solutions.
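
The private check-digit idea quoted above can be sketched as follows. This is an illustration under stated assumptions, not a recommendation: it assumes a numeric symbology whose data check digit is optional, a scanner set to pass that digit through unverified, and a made-up rule (`PRIVATE_OFFSET`) that shifts the standard mod-10 check digit by 5. With that shift, a label carrying a valid standard check digit can never pass the private check.

```python
def mod10_check(payload: str) -> int:
    """Standard weighted mod-10 check digit (weights 3,1,3,... from the right)."""
    total = sum(int(c) * (3 if i % 2 == 0 else 1)
                for i, c in enumerate(reversed(payload)))
    return (10 - total % 10) % 10

PRIVATE_OFFSET = 5  # hypothetical: any value from 1 to 9 separates the two schemes

def private_check(payload: str) -> int:
    """Private check digit: the standard one shifted by PRIVATE_OFFSET."""
    return (mod10_check(payload) + PRIVATE_OFFSET) % 10

def _well_formed(code: str) -> bool:
    return len(code) >= 2 and code.isascii() and code.isdigit()

def is_standard(code: str) -> bool:
    """True if the last digit is a valid standard mod-10 check digit."""
    return _well_formed(code) and int(code[-1]) == mod10_check(code[:-1])

def is_mine(code: str) -> bool:
    """True if the last digit matches the private scheme instead."""
    return _well_formed(code) and int(code[-1]) == private_check(code[:-1])

if __name__ == "__main__":
    payload = "03600029145"
    mine = payload + str(private_check(payload))
    print(mine, is_mine(mine), is_standard(mine))
```

Note the limit of a single check digit: a foreign numeric label that carries no check digit at all would still pass `is_mine` about one time in ten, so this can only be one layer among several.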

Regards,

Walter..
--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com

From: D Yuniskis on 25 May 2010 12:47

Hi Walter,

Walter Banks wrote:

[attributions elided]

>>> It is actually unlikely that a foreign label will conflict
>>> with your identifier space. UPC is the classic case:
>>> users register the part of *their* barcode that makes their
>>> case unique.
>>>
>>> http://www.abblabels.com/Products/html/UPCBarcodeLabels.htm
>> Yes, but *I* can opt to use UPC encoding to represent my
>> data. Doing so suggests that, sooner or later, I will
>> encounter an item with a "real" UPC barcode printed on it
>> that my "system" will mistakenly take as "one of my own".
>> I.e., a box of Cheerios coincidentally has the same
>> "identifier" (encoded into the body of the UPC label)
>> as my "Capacitor, Electrolytic, 25uF, 16WVDC, radial".
>> (the *scanner* can't tell that it's a box of cheerios!)
>
> You can also opt to use UPC register and never mistake a
> cap for a box of cheerios.

But that requires the person operating the register to
be aware of the distinction (pick something other
than caps and cheerios and repeat the exercise).

>>> Think of the UPC registration, for example, as a preamble
>>> followed by the actual information. The registration
>>> has the added advantage of making the owner
>>> information available to most people using UPC
>>> reader equipment. Registration makes your UPC record unique.
>>>
>>> This process makes false positives very rare.
>> Only rare for folks who are part of that "registered identifier
>> space".
>
> In the UPC case, essentially all users are registered. The
> registration cost is very low to encourage participation; the
> benefits are data security and rare false positives. The standard
> UPC record checking will weed out essentially all of the
> remaining unregistered UPC images.
>
> UPC was developed to eliminate the problem that you
> have been describing. A bunch of years ago the barcode
> industry was filled with many different transport layers
> and data formats most with a unique application area.
> The only security was each company creating their
> own encoding and data and record formats.

Exactly. Because the entire namespace ("message space"?)
was available to all users. Imagine if anyone could make up
any FQDN and tried to operate in the presence of other
folks (sharing a namespace).

The problem with the UPC route is it would cause you to use
up huge blocks of numbers needlessly. E.g., imagine if
UPS/FedEx used UPC labels as their "package identifiers".
These are essentially disposable identifiers yet putting
them into a shared/administered namespace would waste
big pieces of that namespace needlessly.

>> Sort of like OUI's -- I can program *any* MAC into
>> my NIC. But, there is no guarantee that I won't end up
>> programming a MAC that isn't already in use, somewhere
>> (belonging to some other organization).
>>
>> This is the problem I am trying to anticipate and avoid.
>>
>>> Code39 does not have the same registration process. In
>>> most of the applications using Code39 record format
>>> and check characters are sufficient to prevent undetected
>>> errors.
>>>
>>> For example, a Code 39 barcode record for
>>> your application might be
>>>
>>> *DJ1234567855*
>>>
>>> This would be a Code 39 label about 1.5" long with
>>> DJ to identify it as yours
>>> 8 data digits
>>> 55 two check digits.
>>>
>>> All records you are interested in would start with
>>> DJ. Since the text is usually printed on the label as well,
>>> there would be limited scope for false positives.
>> Yes, this was my point about adding extra digits (I'm
>> avoiding other characters simply because those other
>> characters restrict you to using a smaller set of
>> symbologies). But, you can't *know* (for sure) what
>> "message/label formats" other people may use. I.e.,
>> tomorrow, you may start doing business with a company who
>> labels *all* of their products with "DJ" numbers. :-/
>> (the longer the 'preface' or other "unique-ifying" data,
>> the less the chance of random conflict ... "YUNISKIS"
>> might be a suitably unique choice for a prefix! :> )
>
> False positives are rare, once format, validation and context

Only if you can pick a unique combination of the above!
Doing this blindly has no guarantees. As I said, if I
arbitrarily picked a 6 character Code 128 identifier
and used that, eventually I *will* scan a label on a Dell
PC and find a coincidental match with some other object
in my database. Granted, you can recover from this.
But, it is a problem that need not happen if I had picked
some other scheme.
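
The "eventually I *will* scan a coincidental match" argument can be put in rough numbers. The sketch below assumes, purely for illustration, that foreign 6-character identifiers are drawn uniformly from a 36-symbol alphabet (A-Z, 0-9), which gives 36**6 = 2,176,782,336 possible values. Real service tags are certainly not uniform, so treat the result as an order-of-magnitude guide. The function name and parameters are hypothetical.

```python
def p_false_hit(db_size: int, foreign_scans: int,
                alphabet: int = 36, length: int = 6) -> float:
    """Chance that at least one of foreign_scans random identifiers
    coincides with one of db_size identifiers already in the database."""
    space = alphabet ** length
    p_single = db_size / space          # chance that one foreign scan matches
    return 1 - (1 - p_single) ** foreign_scans

if __name__ == "__main__":
    for scans in (1, 1_000, 100_000):
        print(scans, p_false_hit(db_size=10_000, foreign_scans=scans))
```

The per-scan chance is tiny, which fits the "false positives are rare" side of the argument, while the cumulative chance grows with every foreign label that gets scanned, which fits the "sooner or later" side.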

> are accounted for. For example, the automotive industry
> has literally hundreds of suppliers using two or three
> standard barcode encodings and don't have a problem
> with false positives because of this.

But I imagine they:
- don't let items come into their receiving dock without
rigidly defining how they are labeled; or
- instruct their staff to obscure any other barcodes
that may be on the packaging/items; or
- instruct their staff in *which* barcode is the "correct"
barcode to scan
- etc.

I.e., they can't just scan a barcode -- ANY BARCODE IN THE
BUILDING -- and be assured that it is *the* barcode that they
wanted to scan.

(they are also big enough to be able to enforce their wishes
on their suppliers -- "you will label your parts in this
fashion")

I can work around *some* problem areas by visual cues in
the labels. E.g., location identifiers are affixed to
blue plastic placards so a user "knows" this is a location
identifier and doesn't have to go hunting for it. But,
the software can't tell that the data scanned came from
a "blue plastic placard".

>> For example, Dell uses 6 character identifiers for their
>> "service tags". What are the chances that using a similar
>> format would result in eventually coming up with a "hit"
>> that was unintended?
>
> Actually, Dell now uses a 7 character identifier for service

I stand corrected. I was looking at a license tag. <:-)

> tags. However, you just identified an important point
> about identifying barcodes: context. Dell uses several barcodes
> on the bottom of their laptops to identify product,
> manufacturer, software licences and service tags.

Yes. So you either have to know which barcode to scan
or have to encode information in the label that lets
the system figure out which label you've scanned and
prompt you to scan some other (if it is awaiting some
specific piece of information).

I want to be able to have a label scanned and be able
to tell the user authoritatively that it is the "right"
label to scan and/or the right label (item) he is looking
for.
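
One way to encode "which label is this?" in the identifier itself is a leading kind digit. The sketch below is hypothetical throughout: the kind table, the digit assignments and the function names are invented for illustration, and nothing in the thread specifies them.

```python
from typing import Optional

# Hypothetical assignment of a leading digit to each kind of label.
KINDS = {"1": "part", "2": "location", "3": "manifest", "4": "person"}

def kind_of(identifier: str) -> Optional[str]:
    """Return the kind encoded in the first character, or None if unknown."""
    return KINDS.get(identifier[:1])

def prompt_for(expected: str, identifier: str) -> Optional[str]:
    """Return a message for the user, or None if the scan is the expected kind."""
    actual = kind_of(identifier)
    if actual is None:
        return f"Not one of our labels; please scan the {expected} label."
    if actual != expected:
        return f"That is a {actual} label; please scan the {expected} label."
    return None

if __name__ == "__main__":
    print(prompt_for("location", "1000123"))
```

The point of the design is that the software, not the operator, decides whether the scan was the right kind of label.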

>> I wonder how many Code 128 labels
>> have "SN" (Serial Number) as their first two characters?
>
> Probably quite a few and a lot more Code39. Fewer
> using the same field format for the actual serial number
> fields which encode manufacture data, product id and
> actual identifying number.

But 123-45-5678 from Dell means something different from
12-34-556-78 from T.I. (bogus examples). I.e. the
namespace (messagespace) isn't standardized. I suspect
folks like Dell make this work by having *lengthy* labels,
possibly with large Hamming distances or lots of enforced
redundancy -- and by controlling the items that can
get "on the floor" *with* "foreign labels".
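
For readers unfamiliar with the term, the Hamming distance between two equal-length identifiers is the number of positions in which they differ; a set of codes with a large minimum distance needs several simultaneous misreads before one valid code turns into another. A minimal sketch (function names chosen for this example):

```python
from itertools import combinations
from typing import Iterable

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def min_distance(codes: Iterable[str]) -> int:
    """Smallest pairwise Hamming distance in a set of at least two codes."""
    return min(hamming(a, b) for a, b in combinations(list(codes), 2))

if __name__ == "__main__":
    print(min_distance(["000", "111", "222"]))
    print(min_distance(["100", "101", "222"]))
```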

>> I am, instead, looking to pick a symbology that is less
>> likely to be encountered. *Or*, bend the rules regarding
>> *my* choices of symbols such that "everyone else's"
>> look invalid to me. I.e., you're advocating making them
>> look invalid by relying on "DJ" to make them unique;
>> I'm thinking, instead, of violating some inherent
>> aspect of the label format -- check digit -- that they
>> *wouldn't*... because they would want the scanner to
>> do the check digit verification *for* them whereas I
>> am willing to take on that task myself -- using a different
>> "scheme".
>
> Barcode formats are implemented in layers. Additional
> checking in some cases is important. Do it. I am not

I intend to. Beginning with verification of the symbology
used in each scanned label.

> sure what your application is. Describing the application
> may identify or reject some of the standard solutions.

Loosely speaking, inventory tracking. But, using the labels
throughout -- to identify parts, to identify stock locations,
to identify shipping manifests, to identify the individuals
picking/packaging/shipping the items, etc. They're just
manifestations of "universal identifiers". I just want to
minimize the chance of some *other* (foreign) label being
erroneously interpreted as "one of mine" and having the
process stop while folks sort out why the item they
seek isn't in "this" location ("Oh, you scanned a barcode
that *Digikey* had applied to a component; *not* the little
blue plastic placard that we use to identify locations!")

The Average Joe isn't going to understand the nuances
of "barcodology" :-/

From: Paul Hovnanian P.E. on 23 May 2010 14:26

D Yuniskis wrote:

[snip]
> Yes, that was my initial plan. Relying *solely* on that, however,
> leaves you open to "collisions" with other "unrelated labels"
> in that same symbology -- UNLESS YOU CAN GUARANTEE THAT YOUR
> LABELS WILL BE "INVALID" (violate some "given") WRT THOSE
> OTHER "valid" labels.

One way you can ensure non-collisions is to select a part of the UPC (or
whatever) code space that you are guaranteed not to see in your
application. For example, if you are developing an app for auto parts, code
everything as some sort of vegetable.

Your app will still break if you bring it into the grocery store. It might
also break if some third party has appropriated the same idea as you have
for labels that will appear in your domain.

--
Paul Hovnanian paul(a)hovnanian.com
----------------------------------------------------------------------
Have gnu, will travel.

From: 1 Lucky Texan on 25 May 2010 13:05

On May 25, 10:46 am, D Yuniskis <not.going.to...(a)seen.com> wrote:
> Hi Tony,
>
> Tony wrote:
> > Code 39 is very wasteful as it covers ASCII; ITF (Interleaved 2 of 5)
> > would be more efficient space-wise (if that is important to you) for
> > decimal only. More space = more characters and fixed digits.
>
> Yes, I was acknowledging this in my initial decision to
> just stick to pure numeric identifiers (they are just
> identifiers -- no need for them to have human readable
> content). This gives me the most choices between symbologies.
> Also *can* give me better character densities and/or
> decoding reliability (by storing less data in a given
> space *if* done properly).
>
> > I used this for a encryption system that needed to be programmed with
> > unique number and readable later on in the process. False reads were
>
> Not sure I understand your application; did the "programmer"
> then print a barcode label on the device? (which was
> later read)
>
> > important, although the reader was built into a test fixture. I also
>
> With a non contact scanner, you can usually do really good at
> reader accuracy (and first pass read rate should be damn near
> "six nines"). I've designed wand-type scanners that had to
> hit 99% FPRR and 99% accuracy (which is tough because the
> user can't be guaranteed to scan at a constant speed *and*
> the range of speeds varies over two or more orders of
> magnitude!)
>
> > used 3 (fixed) digits for a customer code, of which there were only 3 or
>
> OK, so you presumably chose those "valid" codes to maximize
> their hamming distances?
>
> > 4 customers. The test fixture was set for the appropriate customer at
> > the start of a shift; non-matches were rejected.
>
> > The fixed parts help reduce the possibility of a false hit, and you can
> > further reduce your chances of a false read by fixing the number of
> > digits in the code. Most scanners can be programmed for this. They can
>
> I can handle that in the system software. I.e., I can validate
> any identifiers as "mine" by checking:
> - symbology used (scanner can tell me this)
> - message format (digits, preamble, check digit, etc.)
> - "does the identifier exist in the database"?
>
> > also be programmed as to whether the last digit is a check digit or not,
> > and whether to transmit this.
>
> Is this true of all scanners? I have found it to be the case
> of the few that I have examined but haven't seen that as a
> "guaranteed feature" (i.e., does The Industry require this
> as a "basic feature" or is it an enhancement offered by many/all
> scanner vendors?)
>
> > I also used bearer bars top and bottom to help reduce bad reads further..
>
> Ah, good point! Though I would think that an examination of
> the "read data" could give you this information, as well
> (i.e., short read, bad check digit, etc.)
>
> My problem will be making sure some *other* label isn't
> accidentally scanned AS IF it was "mine". E.g., I chuckle
> watching folks at the "self check-out" at the library
> scanning their books and getting frustrated because the
> system doesn't recognize the barcode (because they are
> scanning the EAN code instead of the library's *specific*
> "item number" label). Poorly designed system has the scanner
> beeping even on bad scans (so folks who just listen for the
> beep wonder why their receipt doesn't have all the items
> listed on it!) as well as failing to inform the patron that
> "you've scanned the wrong label" (since the system could
> easily know that the scanner just saw an EAN label instead
> of the library's specific label!)
>
> [people who "assemble" systems from OTS subsystems often don't
> think things through completely, IME]

If it is extremely important to decrease the odds of scanning a false
positive, perhaps special lighting or filters could be used.
Maybe a UV-illuminated label or a specific color/special inks?

This may be overkill in capacity, but it might be easily recognized as
the 'correct' label due to its color:
http://en.wikipedia.org/wiki/High_Capacity_Color_Barcode

From: Walter Banks on 25 May 2010 14:11

D Yuniskis wrote:

> The problem with the UPC route is it would cause you to use
> up huge blocks of numbers needlessly. E.g., imagine if
> UPS/FedEx used UPC labels as their "package identifiers".
> These are essentially disposable identifiers yet putting
> them into a shared/administered namespace would waste
> big pieces of that namespace needlessly.

UPC registration comes in two parts. A registered
part identifying the owner and a block of numbers
for the registered owner to use.

> > False positives are rare, once format, validation and context
>
> Only if you can pick a unique combination of the above!
> Doing this blindly has no guarantees. As I said, if I
> arbitrarily picked a 6 character Code 128 identifier
> and used that, eventually I *will* scan a label on a Dell
> PC and find a coincidental match with some other object
> in my database. Granted, you can recover from this.
> But, it is a problem that need not happen if I had picked
> some other scheme.

Barcodes can be made arbitrarily reliable and unique by
trading information space for reliability. Using a standard
transport layer and a modest amount of validation of the
record layer can be very reliable. Changing the barcode
format to an obscure format will not change the
overall reliability very much.
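
The layering described here lines up with the three checks D Yuniskis lists elsewhere in the thread: the symbology reported by the scanner, the message format, and whether the identifier exists in the database. A minimal sketch follows; the symbology name, the 10-digit format and the function name are placeholders invented for the example, not anything the thread specifies.

```python
import re
from typing import Optional

MY_SYMBOLOGY = "code128"                 # hypothetical name reported by the scanner
MY_FORMAT = re.compile(r"[0-9]{10}")     # hypothetical record format: 10 digits

def reject_reason(symbology: str, data: str, known_ids: set) -> Optional[str]:
    """Return why a scan is not one of ours, or None if it passes every layer."""
    if symbology != MY_SYMBOLOGY:
        return "wrong symbology"         # transport layer does not match
    if not MY_FORMAT.fullmatch(data):
        return "wrong format"            # record layer does not match
    if data not in known_ids:
        return "unknown identifier"      # well-formed, but never issued by us
    return None

if __name__ == "__main__":
    issued = {"0000012345"}
    print(reject_reason("ean13", "0036000291452", issued))
    print(reject_reason("code128", "0000012345", issued))
```

Returning a reason rather than a bare pass/fail lets the application tell the operator what went wrong instead of just beeping.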

In a well implemented barcode system false positives
are rare.

> Yes. So you either have to know which barcode to scan
> or have to encode information in the label that lets
> the system figure out which label you've scanned and
> prompt you to scan some other (if it is awaiting some
> specific piece of information).
>
> I want to be able to have a label scanned and be able
> to tell the user authoritatively that it is the "right"
> label to scan and/or the right label (item) he is looking
> for.

In your proposal, some of the symbol space is translated
into the transport layer. It is a tradeoff but not a
significant one.

> >> I wonder how many Code 128 labels
> >> have "SN" (Serial Number) as their first two characters?
> >
> > Probably quite a few and a lot more Code39. Fewer
> > using the same field format for the actual serial number
> > fields which encode manufacture data, product id and
> > actual identifying number.
>
> But 123-45-5678 from Dell means something different from
> 12-34-556-78 from T.I. (bogus examples). I.e. the
> namespace (messagespace) isn't standardized. I suspect
> folks like Dell make this work by having *lengthy* labels,
> possibly with large hamming distances or lots of enforced
> redundancy -- and by controlling the items that can
> get "on the floor" *with* "foreign labels".

The Dell barcodes are on the bottom of most laptops, and most
are actually quite short. They have made the choice to scan
the appropriate label. Interestingly enough, the 6 or 7 character
service tag is still not likely to be misread. It has an
alphanumeric format with record type information built into the
order and range of the alphanumeric characters.

> Loosely speaking, inventory tracking. But, using the labels
> throughout -- to identify parts, to identify stock locations,
> to identify shipping manifests, to identify the individuals
> picking/packaging/shipping the items, etc. They're just
> manifestations of "universal identifiers". I just want to
> minimize the chance of some *other* (foreign) label being
> erroneously interpreted as "one of mine" and having the
> process stop while folks sort out why the item they
> seek isn't in "this" location ("Oh, you scanned a barcode
> that *Digikey* had applied to a component; *not* the little
> blue plastic placard that we use to identify locations!")

What you are trying to do is a common problem. Making
the barcode unreadable to most readers is one solution
but there are many other effective solutions most related
to either central registries or common transport layer
and symbols with individual record format.

False reads of the kind you describe almost never happen
in a real environment. Too many simultaneous failures
would need to happen. Barcode standards evolved in
order to prevent chaos of this type.

Regards,

Walter..
--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com
