mirror of https://git.code.sf.net/p/zint/code synced 2025-12-18 02:17:06 +00:00

Integrate GS1 Syntax Engine

This commit is contained in:
gitlost
2025-09-12 04:20:55 +01:00
parent ad95d8f2b0
commit 0650d5798e
32 changed files with 2109 additions and 723 deletions


@@ -125,7 +125,7 @@ vector
## 2.1 Linux
The easiest way to configure compilation is to take advantage of the CMake
utilities. You will need to install CMake and `libpng-dev` first. For instance
utilities. You will need to install CMake, and `libpng-dev` first. For instance
on `apt` systems:
```bash
@@ -133,8 +133,9 @@ sudo apt install git cmake build-essential libpng-dev
```
If you want to take advantage of Zint Barcode Studio you will also need to have
Qt and its component `"Desktop gcc 64-bit"` installed, as well as `mesa`. For
details see `"README.linux"` in the project root directory.
Qt and its component `"Desktop gcc 64-bit"` installed, as well as `mesa`. Other
steps are required to avail of the GS1 Syntax Engine.[^1] For details see
`"README.linux"` in the project root directory.
Once you have fulfilled these requirements unzip the source code tarball or
clone the latest source
@@ -175,6 +176,12 @@ the `"frontend"` sub-directory. To run the test type
This should create numerous files in the sub-directory `"frontend/test_sh_out"`
showing the many modes of operation which are available from Zint.
[^1]: The GS1 Syntax Engine (`gs1encoders` library), which is officially
sanctioned by GS1, offers strict validation of GS1 data, including GS1 Digital
Link URIs - see "GS1 Barcode Syntax Engine" at
[https://github.com/gs1/gs1-syntax-engine](
https://github.com/gs1/gs1-syntax-engine).
## 2.2 BSD
The latest Zint CLI, `libzint` library and GUI can be installed from the `zint`
@@ -216,7 +223,7 @@ To build Zint on Windows from source, see `"win32/README"`.
## 2.4 Apple macOS
The latest Zint CLI and `libzint` can be installed using Homebrew.[^1] To
The latest Zint CLI and `libzint` can be installed using Homebrew.[^2] To
install Homebrew input the following line into the macOS terminal
```bash
@@ -234,7 +241,7 @@ brew install zint
To build from source (and install the GUI) see `"README.macos"` in the project
root directory.
[^1]: See the Homebrew website [https://brew.sh](https://brew.sh).
[^2]: See the Homebrew website [https://brew.sh](https://brew.sh).
## 2.5 Zint Tcl Backend
@@ -564,7 +571,7 @@ Sequence Equivalent
`\xNN` 0xNN Any 8-bit character where NN is hexadecimal
(00-FF)
`\uNNNN` Any 16-bit Unicode BMP[^2] character where
`\uNNNN` Any 16-bit Unicode BMP[^3] character where
NNNN is hexadecimal (0000-FFFF)
`\UNNNNNN` Any 21-bit Unicode character where NNNNNN
@@ -573,7 +580,7 @@ Sequence Equivalent
Table: {#tbl:escape_sequences tag=": Escape Sequences"}
[^2]: In Unicode contexts, BMP stands for Basic Multilingual Plane, the plane 0
[^3]: In Unicode contexts, BMP stands for Basic Multilingual Plane, the plane 0
codeset from U+0000 to U+D7FF and U+E000 to U+FFFF (i.e. excluding surrogates).
Not to be confused with the Windows Bitmap file format BMP!
@@ -658,7 +665,7 @@ Names are treated case-insensitively by the CLI, and the `BARCODE_` prefix and
any underscores are optional.
-----------------------------------------------------------------------------
Numeric Name[^3] Barcode Name
Numeric Name[^4] Barcode Name
Value
------- -------------------------- ---------------------------------------
1 `BARCODE_CODE11` Code 11
@@ -884,7 +891,7 @@ Value
Table: {#tbl:barcode_types tag=": Barcode Types (Symbologies)"}
[^3]: The symbology names marked with an asterisk (`*`) in Table
[^4]: The symbology names marked with an asterisk (`*`) in Table
{@tbl:barcode_types} above used different names in previous versions of Zint.
These names are now deprecated but are still recognised by Zint. Those marked
with a dagger (`†`) are replacements for `BARCODE_EANX` (13), `BARCODE_EANX_CHK`
@@ -1058,7 +1065,7 @@ zint --bg=ff0000 --fg=ffffff00 ...
will give different results for PNG and SVG. Experimentation is advised!
In addition the `--nobackground` option will remove the background from all
output formats except BMP.[^4]
output formats except BMP.[^5]
The `--cmyk` option is specific to output in Encapsulated PostScript (EPS) and
TIF, and selects the CMYK colour space. Custom colours should then usually be
@@ -1066,7 +1073,7 @@ given in the comma-separated `"C,M,Y,K"` format, where `C`, `M`, `Y` and `K` are
expressed as decimal percentage values from 0 to 100. RGB values may still be
used, in which case they will be converted formulaically to CMYK approximations.
[^4]: The background is omitted for vector outputs EMF, EPS and SVG when
[^5]: The background is omitted for vector outputs EMF, EPS and SVG when
`--nobackground` is given. For raster outputs GIF, PCX, PNG and TIF, the
background's alpha channel is set to zero (fully transparent).
@@ -1193,7 +1200,7 @@ zint -b MAXICODE -d "MaxiCode (19 chars)" --scalexdimdp=0,600dpi
## 4.10 Human Readable Text (HRT) Options
For linear barcodes the text present[^5] in the output image can be removed by
For linear barcodes the text present[^6] in the output image can be removed by
using the `--notext` option. Note also that for raster output text will not be
printed for scales less than 1 (see [4.9 Adjusting Image Size (X-dimension)]).
@@ -1217,7 +1224,7 @@ for all others) can be embedded in the file for portability using the
![`zint -d "Áccent" --embedfont`](images/code128_embedfont.svg){.lin}
[^5]: For linear barcodes, Human Readable Text (HRT) is not shown for the postal
[^6]: For linear barcodes, Human Readable Text (HRT) is not shown for the postal
codes Australia Post (all variants), USPS Intelligent Mail, POSTNET and PLANET,
Brazilian CEPNet, Royal Mail 4-State Customer Code and 4-State Mailmark, Dutch
Post KIX Code, Japanese Postal Code, DAFT Code and FIM, the pharma codes
@@ -1253,7 +1260,7 @@ Grid Matrix GB 2312 (includes ASCII) N/A
Han Xin Latin-1 GB 18030 (includes ASCII)
MaxiCode Latin-1 None
MicroPDF417 Latin-1 None
Micro QR Code Latin-1 Shift JIS (includes ASCII[^6])
Micro QR Code Latin-1 Shift JIS (includes ASCII[^7])
PDF417 Latin-1 None
QR Code Latin-1 Shift JIS (see above)
rMQR Latin-1 Shift JIS (see above)
@@ -1263,7 +1270,7 @@ All others ASCII N/A
Table: {#tbl:default_character_sets tag=": Default Character Sets"}
[^6]: Shift JIS (JIS X 0201 Roman) re-maps two ASCII characters: backslash (`\`)
[^7]: Shift JIS (JIS X 0201 Roman) re-maps two ASCII characters: backslash (`\`)
to the yen sign (¥), and tilde (`~`) to overline (U+203E).
If Zint encounters characters which can not be encoded using the default
@@ -1273,10 +1280,12 @@ Interpretations) mechanism to encode the data if the symbology supports it - see
GS1 data can be encoded in a number of symbologies. Application Identifiers
(AIs) should be enclosed in `[square brackets]` followed by the data to be
encoded (see [6.1.10.3 GS1-128]). To encode GS1 data use the `--gs1` option.
GS1 mode is assumed (and doesn't need to be set) for GS1-128, EAN-14, GS1
DataBar and GS1 Composite symbologies but is also available for Aztec Code, Code
16K, Code 49, Code One, Data Matrix, DotCode, QR Code and Ultracode.
encoded (see [6.1.10.3 GS1-128]). GS1 Digital Link URIs are also supported. To
encode GS1 data use the `--gs1` option. Also recommended is the `--gs1strict`
option, which verifies the GS1 data. GS1 mode is assumed (and doesn't need to be
set) for GS1-128, EAN-14, GS1 DataBar and GS1 Composite symbologies but is also
available for Aztec Code, Code 16K, Code 49, Code One, Data Matrix, DotCode, QR
Code and Ultracode.
Health Industry Barcode (HIBC) data may also be encoded in the symbologies Aztec
Code, Codablock-F, Code 128, Code 39, Data Matrix, MicroPDF417, PDF417 and QR
@@ -1358,12 +1367,12 @@ ECI Code Character Encoding Scheme (ISO/IEC 8859 schemes include ASCII)
33 UTF-16LE (Low order byte first)
34 UTF-32BE (High order bytes first)
35 UTF-32LE (Low order bytes first)
170 ISO/IEC 646 Invariant[^7]
170 ISO/IEC 646 Invariant[^8]
899 8-bit binary data
Table: {#tbl:eci_codes tag=": ECI Codes"}
[^7]: ISO/IEC 646 Invariant is a subset of ASCII with 12 characters undefined:
[^8]: ISO/IEC 646 Invariant is a subset of ASCII with 12 characters undefined:
`#`, `$`, `@`, `[`, `\`, `]`, `^`, `` ` ``, `{`, `|`, `}`, `~` (tilde).
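The twelve exclusions listed in the footnote above can be checked mechanically. The following is a minimal sketch, not Zint code: the function name `demo_is_iso646_invariant` is hypothetical, and it assumes that any other 7-bit byte (including control characters) is acceptable, which the footnote does not state either way.

```c
#include <string.h>

/* Hypothetical helper (not part of the Zint API): returns 1 if every byte of
   the NUL-terminated string `s` is 7-bit and is not one of the 12 characters
   the footnote lists as undefined in ISO/IEC 646 Invariant, else 0.
   Assumption: control characters are not rejected by this sketch. */
static int demo_is_iso646_invariant(const char *s) {
    /* The 12 characters undefined in ISO/IEC 646 Invariant */
    static const char excluded[] = "#$@[\\]^`{|}~";

    for (; *s; s++) {
        const unsigned char c = (unsigned char) *s;
        if (c >= 0x80 || strchr(excluded, c) != NULL) {
            return 0;
        }
    }
    return 1;
}
```

A caller might use such a check to decide whether ECI 170 is a candidate for a given input before falling back to a wider character set.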
An ECI value of 0 does not encode any ECI information in the code symbol (unless
@@ -1923,10 +1932,10 @@ int main(int argc, char **argv)
```
will print the SVG output to `stdout` (the file `"mem.svg"` is not created).
This is particularly useful for the textual formats EPS and SVG,[^8] allowing
This is particularly useful for the textual formats EPS and SVG,[^9] allowing
the output to be manipulated and processed by the client.
[^8]: BARCODE_MEMORY_FILE textual formats EPS and SVG will have Unix newlines
[^9]: BARCODE_MEMORY_FILE textual formats EPS and SVG will have Unix newlines
(LF) on both Windows and Unix, i.e. not CR+LF on Windows.
## 5.7 Setting Options
@@ -1947,7 +1956,7 @@ Member Name Type Meaning Default Value
`height` float Symbol height in Symbol dependent
X-dimensions, excluding
fixed width-to-height
symbols.[^9]
symbols.[^10]
`scale` float Scale factor for 1.0
adjusting size of image
@@ -1997,7 +2006,7 @@ Member Name Type Meaning Default Value
`.eps`, `.pcx`, `.svg`,
`.tif` or `.txt` followed
by a terminating
`NUL`.[^10]
`NUL`.[^11]
`primary` character Primary message data for `""` (empty)
string more complex symbols,
@@ -2124,13 +2133,13 @@ Member Name Type Meaning Default Value
Table: API Structure `zint_symbol` {#tbl:api_structure_zint_symbol tag="$ $"}
[^9]: The `height` value is ignored for Aztec (including HIBC and Aztec Rune),
[^10]: The `height` value is ignored for Aztec (including HIBC and Aztec Rune),
Code One, Data Matrix (including HIBC), DotCode, Grid Matrix, Han Xin, MaxiCode,
QR Code (including HIBC, Micro QR, rMQR and UPNQR), and Ultracode - all of which
have a fixed width-to-height ratio (or, in the case of Code One, a fixed
height).
[^10]: For Windows, `outfile` is assumed to be UTF-8 encoded.
[^11]: For Windows, `outfile` is assumed to be UTF-8 encoded.
To alter these values use the syntax shown in the example below. This code has
the same result as the previous example except the output is now taller and
@@ -2302,10 +2311,10 @@ Value Effect
------------------------- ---------------------------------------------------
0 No options selected.
`BARCODE_BIND_TOP` Boundary bar above the symbol only.[^11]
`BARCODE_BIND_TOP` Boundary bar above the symbol only.[^12]
`BARCODE_BIND` Boundary bars above and below the symbol and
between rows if stacking multiple symbols.[^12]
between rows if stacking multiple symbols.[^13]
`BARCODE_BOX` Add a box surrounding the symbol and whitespace.
@@ -2332,7 +2341,7 @@ Value Effect
Symbols in Memory (raster)].
`BARCODE_QUIET_ZONES` Add compliant quiet zones (additional to any
specified whitespace).[^13]
specified whitespace).[^14]
`BARCODE_NO_QUIET_ZONES` Disable quiet zones, notably those with defaults.
@@ -2354,13 +2363,13 @@ Value Effect
Table: API `output_options` Values {#tbl:api_output_options tag="$ $"}
[^11]: The `BARCODE_BIND_TOP` flag is set by default for DPD - see [6.1.10.7 DPD
[^12]: The `BARCODE_BIND_TOP` flag is set by default for DPD - see [6.1.10.7 DPD
Code].
[^12]: The `BARCODE_BIND` flag is always set for Codablock-F, Code 16K and Code
[^13]: The `BARCODE_BIND` flag is always set for Codablock-F, Code 16K and Code
49. Special considerations apply to ITF-14 - see [6.1.2.6 ITF-14].
[^13]: Codablock-F, Code 16K, Code 49, EAN-13, EAN-8, EAN/UPC add-ons, ISBN,
[^14]: Codablock-F, Code 16K, Code 49, EAN-13, EAN-8, EAN/UPC add-ons, ISBN,
ITF-14, UPC-A and UPC-E have compliant quiet zones added by default.
## 5.11 Setting the Input Mode
@@ -2369,49 +2378,52 @@ The way in which the input data is encoded can be set using the `input_mode`
member. Valid values are shown in the table below.
------------------------------------------------------------------------------
Value Effect
------------------ ----------------------------------------------------------
`DATA_MODE` Uses full 8-bit range interpreted as binary data.
Value Effect
---------------------- ------------------------------------------------------
`DATA_MODE` Uses full 8-bit range interpreted as binary data.
`UNICODE_MODE` Uses UTF-8 input.
`UNICODE_MODE` Uses UTF-8 input.
`GS1_MODE` Encodes GS1 data using `FNC1` characters.
`GS1_MODE` Encodes GS1 data using `FNC1` characters.
_The above are exclusive, the following optional and
OR-ed._
_The above are exclusive, the following optional and
OR-ed._
`ESCAPE_MODE` Process input data for escape sequences.
`ESCAPE_MODE` Process input data for escape sequences.
`GS1PARENS_MODE` Parentheses (round brackets) used in GS1 data instead of
square brackets to delimit Application Identifiers
(parentheses in the data must be escaped and `ESCAPE_MODE`
selected).
`GS1PARENS_MODE` Parentheses (round brackets) used in GS1 data instead
of square brackets to delimit Application Identifiers
(parentheses in the data must be escaped).
`GS1NOCHECK_MODE` Do not check GS1 data for validity, i.e. suppress checks
for valid AIs and data lengths. Invalid characters (e.g.
control characters, extended ASCII characters) are still
checked for.
`GS1NOCHECK_MODE` Do not check GS1 data for validity, i.e. suppress
checks for valid AIs and data lengths. Invalid
characters (e.g. control characters, extended ASCII
characters) are still checked for.
`HEIGHTPERROW_MODE` Interpret the `height` member as per-row rather than as
overall height.
`HEIGHTPERROW_MODE` Interpret the `height` member as per-row rather than
as overall height.
`FAST_MODE` Use faster if less optimal encodation or other shortcuts
if available (affects `DATAMATRIX`, `MICROPDF417`,
`PDF417`, `QRCODE` and `UPNQR` only).
`FAST_MODE` Use faster if less optimal encodation or other
shortcuts if available (affects `DATAMATRIX`,
`MICROPDF417`, `PDF417`, `QRCODE` and `UPNQR` only).
`EXTRA_ESCAPE_MODE` Process special symbology-specific escape sequences
(`CODE128` only).
`EXTRA_ESCAPE_MODE` Process special symbology-specific escape sequences
(`CODE128` only).
`GS1SYNTAXENGINE_MODE` Use the GS1 Syntax Engine (if available) to strictly
validate GS1 input.
------------------------------------------------------------------------------
Table: API `input_mode` Values {#tbl:api_input_mode tag="$ $"}
The default mode is `DATA_MODE`. (Note that this differs from the default for
the CLI and GUI, which is `UNICODE_MODE`.)
The default mode is `DATA_MODE` (CLI option `--binary`). (Note that this differs
from the default for the CLI and GUI, which is `UNICODE_MODE`.)
`DATA_MODE`, `UNICODE_MODE` and `GS1_MODE` are mutually exclusive, whereas
`ESCAPE_MODE`, `GS1PARENS_MODE`, `GS1NOCHECK_MODE`, `HEIGHTPERROW_MODE`,
`FAST_MODE` and `EXTRA_ESCAPE_MODE` are optional. So, for example, you can set
`FAST_MODE`, `EXTRA_ESCAPE_MODE` and `GS1SYNTAXENGINE_MODE` are optional. So,
for example, you can set
```c
my_symbol->input_mode = UNICODE_MODE | ESCAPE_MODE;
@@ -2436,13 +2448,19 @@ Permissible escape sequences (`ESCAPE_MODE`) are listed in Table
escape sequences are given in [6.1.10.1 Standard Code 128 (ISO 15417)]. An
example of `GS1PARENS_MODE` usage is given in section [6.1.10.3 GS1-128].
`GS1NOCHECK_MODE` is for use with legacy systems that have data that does not
conform to the current GS1 standard. Printable ASCII input is still checked for,
as is the validity of GS1 data specified without AIs (e.g. linear data for GS1
DataBar Omnidirectional/Limited/etc.). Also checked is GS1 DataBar Expanded and
GS1 Composite input that is not in the GS1 encodable character set 82 (see GS1
General Specifications Figure 7.11.1 'GS1 AI encodable character set 82'),
otherwise encodation would fail.
`GS1NOCHECK_MODE` (CLI `--gs1nocheck`) is for use with legacy systems that have
data that does not conform to the current GS1 standard. Printable ASCII input is
still checked for, as is the validity of GS1 data specified without AIs (e.g.
linear data for GS1 DataBar Omnidirectional/Limited/etc.). Also checked is GS1
DataBar Expanded and GS1 Composite input that is not in the GS1 encodable
character set 82 (see GS1 General Specifications Figure 7.11.1 'GS1 AI encodable
character set 82'), otherwise encodation would fail.
In contrast `GS1SYNTAXENGINE_MODE` (CLI `--gs1strict`) enables the use of the GS1
Syntax Engine to strictly validate GS1 data, including GS1 Digital Link URIs (by
default Zint does not validate Digital Links at all). It requires that the
`gs1encoders` library was present when Zint was built, otherwise the default
built-in validation will be used.
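The split between one exclusive base mode and any number of OR-ed optional flags can be illustrated with a self-contained sketch. The `DEMO_` names and all numeric values below are assumptions made for illustration only; real code should include `zint.h` and use its constants rather than these stand-ins.

```c
/* Illustrative stand-ins only: the names mirror the `input_mode` table, but
   the numeric values are assumptions for this sketch, not taken from zint.h */
enum {
    DEMO_DATA_MODE            = 0,
    DEMO_UNICODE_MODE         = 1,
    DEMO_GS1_MODE             = 2,
    DEMO_BASE_MASK            = 0x0007, /* Assumed: low bits hold the base mode */
    DEMO_ESCAPE_MODE          = 0x0008,
    DEMO_GS1PARENS_MODE       = 0x0010,
    DEMO_GS1NOCHECK_MODE      = 0x0020,
    DEMO_GS1SYNTAXENGINE_MODE = 0x0200
};

/* Extract the exclusive base mode (data, Unicode or GS1) */
static int demo_base_mode(int input_mode) {
    return input_mode & DEMO_BASE_MASK;
}

/* Return 1 if the optional `flag` has been OR-ed into `input_mode` */
static int demo_has_flag(int input_mode, int flag) {
    return (input_mode & flag) != 0;
}
```

For instance, `demo_base_mode(DEMO_GS1_MODE | DEMO_GS1SYNTAXENGINE_MODE)` yields `DEMO_GS1_MODE`, and `demo_has_flag()` on the same value reports the strict-validation flag as set while `DEMO_GS1NOCHECK_MODE` is not.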
For `HEIGHTPERROW_MODE`, see `--heightperrow` in section [4.4 Adjusting Height].
The `height` member should be set to the desired per-row value on input (it will
@@ -2618,7 +2636,7 @@ Value Meaning
`ZINT_CAP_STACKABLE` Is the symbology stackable? Note that stacked
symbologies are not stackable.
`ZINT_CAP_EANUPC`[^14] Is the symbology EAN/UPC?
`ZINT_CAP_EANUPC`[^15] Is the symbology EAN/UPC?
`ZINT_CAP_COMPOSITE` Does the symbology support composite data? (see
[6.3 GS1 Composite Symbols (ISO 24723)] below)
@@ -2654,7 +2672,7 @@ Value Meaning
Table: {#tbl:api_cap tag=": API Capability Flags"}
[^14]: `ZINT_CAP_EANUPC` was previously named `ZINT_CAP_EXTENDABLE`, which is
[^15]: `ZINT_CAP_EANUPC` was previously named `ZINT_CAP_EXTENDABLE`, which is
still recognised.
For example:
@@ -2681,7 +2699,7 @@ On successful encodation (after using `ZBarcode_Encode()` etc.) the `option_1`,
create the barcode. This is useful for feedback if the values were left as
defaults or were overridden by Zint.
In particular for symbologies that have masks,[^15] `option_3` will contain the
In particular for symbologies that have masks,[^16] `option_3` will contain the
mask used as `(N + 1) << 8`, N being the mask. Also Aztec Code will return the
actual ECC percentage used in `option_1` as `P << 8`, where P is the integer
percentage, the low byte containing the values given in Table {@tbl:aztec_eccs}
@@ -2698,7 +2716,7 @@ being set in `raw_seg_count` - which will always be at least one.
The `source`, `length` and `eci` members of `zint_seg` will be set accordingly -
the unconverted data in `source`, the data length in `length`, and the character
set the data was converted to in `eci`. Any check characters encoded will be
included,[^16] and for GS1 data any `FNC1` separators will be represented as
included,[^17] and for GS1 data any `FNC1` separators will be represented as
`GS` (ASCII 29) characters. UPC-A and UPC-E data will be expanded to EAN-13, as
will EAN-8 but only if it has an add-on (otherwise it will remain at 8 digits),
and any add-ons will follow the 13 digits directly (no separator). GS1 Composite
@@ -2710,16 +2728,16 @@ is `DATA_MODE`, it remains in binary; otherwise it will be in UTF-8. The UTF-8
source may be converted to the character set of the corresponding `eci` member
using the two helper functions discussed next.
[^15]: DotCode, Han Xin, Micro QR Code, QR Code and UPNQR have variable masks.
[^16]: DotCode, Han Xin, Micro QR Code, QR Code and UPNQR have variable masks.
Rectangular Micro QR Code has a fixed mask (4).
[^16]: Except for Japanese Postal Code, whose check character is not truly
[^17]: Except for Japanese Postal Code, whose check character is not truly
representable in the encoded data.
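The feedback encodings described in this section, `(N + 1) << 8` for the mask in `option_3` and `P << 8` for the Aztec ECC percentage in `option_1`, can be unpacked as in the sketch below. The helper names are hypothetical, and the sketch assumes each value occupies a single byte (bits 8 to 15), with the low byte left for the other settings; returning -1 for "no mask reported" is likewise a convention of this sketch only.

```c
/* Hypothetical helpers (not part of the Zint API) for reading back feedback
   values. Assumption: the value of interest sits in bits 8-15. */

/* Recover mask N from an `option_3` holding `(N + 1) << 8`; returns -1 if
   the field is zero (no mask reported) */
static int demo_mask_from_option_3(int option_3) {
    const int stored = (option_3 >> 8) & 0xFF;
    return stored ? stored - 1 : -1;
}

/* Recover the Aztec ECC integer percentage P from an `option_1` holding
   `P << 8` above the low byte */
static int demo_aztec_ecc_percent(int option_1) {
    return (option_1 >> 8) & 0xFF;
}

/* Recover the low byte of `option_1` (the ECC setting value) */
static int demo_aztec_ecc_setting(int option_1) {
    return option_1 & 0xFF;
}
```

With these, an `option_3` of `((5 + 1) << 8) | 200` decodes to mask 5 regardless of what the low byte holds.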
## 5.17 UTF-8 to ECI convenience functions
As a convenience the conversion done by Zint from UTF-8 to ECIs is exposed in
two helper functions (compatible with the `libzueci`[^17] functions
two helper functions (compatible with the `libzueci`[^18] functions
`zueci_utf8_to_eci()` and `zueci_dest_len_eci()`):
@@ -2739,7 +2757,7 @@ returned in `p_dest_length`, may be smaller than the estimate given by
NUL-terminated. The destination buffer is not NUL-terminated. The obsolete ECIs
0, 1 and 2 are supported.
[^17]: The library `libzueci`, which can convert both to and from UTF-8 and ECI,
[^18]: The library `libzueci`, which can convert both to and from UTF-8 and ECI,
is available at [https://sourceforge.net/projects/libzueci/](
https://sourceforge.net/projects/libzueci/).
@@ -3315,13 +3333,13 @@ alphanumerics) are not recommended.
![`zint -b CODE128AB -d "130170X178"`](images/code128ab.svg){.lin}
It is sometimes advantageous to stop Code 128 from using Code Set C which
compresses numerical data. The `BARCODE_CODE128AB`[^18] variant (symbology 60)
compresses numerical data. The `BARCODE_CODE128AB`[^19] variant (symbology 60)
suppresses Code Set C in favour of Code Sets A and B.
Note that the special extra escapes mentioned above are not available for this
variant (nor for any other).
[^18]: `BARCODE_CODE128AB` previously used the name `BARCODE_CODE128B`, which is
[^19]: `BARCODE_CODE128AB` previously used the name `BARCODE_CODE128B`, which is
still recognised.
#### 6.1.10.3 GS1-128
@@ -3917,7 +3935,7 @@ first and last digit are ignored, leaving a 4-digit DX Extract number in any
case, which must be in the range 16 to 2047. The second format `"NNN-NN"`
represents the DX Extract as two numbers separated by a dash (`-`), the first
number being 1 to 3 digits (range 1 to 127) and the second 1 to 2 digits (range
0 to 15).[^19]
0 to 15).[^20]
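The two input formats describe the same value. As a minimal sketch, assuming the 4-digit DX Extract equals sixteen times the first number plus the second (an assumption, though one that matches the stated ranges: 1 × 16 + 0 = 16 and 127 × 16 + 15 = 2047), the conversion from the `"NNN-NN"` parts could look like this. The function name is hypothetical and not part of Zint.

```c
/* Hypothetical helper (not part of the Zint API): combine the two parts of a
   "NNN-NN" DX number into the 4-digit DX Extract.
   Assumption: extract = part1 * 16 + part2.
   Returns -1 if either part is outside the ranges given in the text. */
static int demo_dx_extract(int part1, int part2) {
    if (part1 < 1 || part1 > 127 || part2 < 0 || part2 > 15) {
        return -1;
    }
    return part1 * 16 + part2;
}
```

Under that assumption `"115-10"` would correspond to a DX Extract of 1850.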
The optional frame number is a number in the range 0 to 63, and may have a half
frame indicator `"A"` appended. Special character sequences (with or without a
@@ -3927,7 +3945,7 @@ number 62, `"K"` or `"00"` means frame number 63, and `"F"` means frame number
A parity bit is automatically added by Zint.
[^19]: The DX number may be looked up in The (Modified) Big Film Database at
[^20]: The DX number may be looked up in The (Modified) Big Film Database at
[https://thebigfilmdatabase.merinorus.com](
https://thebigfilmdatabase.merinorus.com).