mirror of https://git.code.sf.net/p/zint/code synced 2025-12-20 03:17:12 +00:00

ECI: ECI 899 binary in `UNICODE_MODE` now converted from UTF-8,
  not treated literally as it was before, which was inconsistent
  (literal interpretation now requires `DATA_MODE`)
gitlost
2025-10-16 18:23:48 +01:00
parent dc4ba75eb0
commit 543696cb06
8 changed files with 267 additions and 184 deletions


@@ -1366,13 +1366,17 @@ ECI Code Character Encoding Scheme (ISO/IEC 8859 schemes include ASCII)
34 UTF-32BE (High order bytes first)
35 UTF-32LE (Low order bytes first)
170 ISO/IEC 646 Invariant[^8]
899 8-bit binary data
899 8-bit binary data[^9]
Table: ECI Codes {#tbl:eci_codes}
[^8]: ISO/IEC 646 Invariant is a subset of ASCII with 12 characters undefined:
`#`, `$`, `@`, `[`, `\`, `]`, `^`, `` ` ``, `{`, `|`, `}`, `~` (tilde).
[^9]: Note that unless the `--binary` switch is used, 8-bit binary data for ECI
899 must be given as UTF-8, e.g. a byte `"\x80"` must be represented as the 2
bytes `"\xC2\x80"`; similarly `"\xC0"` as `"\xC3\x80"`, etc.
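
The byte-to-UTF-8 mapping described in the ECI 899 footnote can be sketched in C as below. This is only an illustration, not part of the Zint API: the function name `bytes_to_utf8` is hypothetical, and it assumes each raw byte is to be treated as the code point U+0000 to U+00FF of the same value, which matches the two examples given (`"\x80"` to `"\xC2\x80"`, `"\xC0"` to `"\xC3\x80"`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Zint function): re-encode raw 8-bit bytes as
   UTF-8 by treating each byte as the code point of the same value.
   `dest` must have room for at least 2 * length bytes.
   Returns the number of bytes written to `dest` (not NUL-terminated). */
size_t bytes_to_utf8(const unsigned char *src, size_t length, unsigned char *dest)
{
    size_t i, j = 0;

    assert(src != NULL && dest != NULL);
    for (i = 0; i < length; i++) {
        if (src[i] < 0x80) {
            /* ASCII bytes are unchanged in UTF-8 */
            dest[j++] = src[i];
        } else {
            /* U+0080..U+00FF take two bytes: 110000xx 10xxxxxx */
            dest[j++] = (unsigned char) (0xC0 | (src[i] >> 6));
            dest[j++] = (unsigned char) (0x80 | (src[i] & 0x3F));
        }
    }
    return j;
}
```

A caller that cannot or does not want to use `--binary` (or `DATA_MODE` in the API) could run binary input through a conversion like this before passing it to Zint with ECI 899.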
An ECI value of 0 does not encode any ECI information in the code symbol (unless
the data contains non-default character set characters). In this case, the
default character set applies (see [#tbl:default_character_sets] above).
@@ -1928,10 +1932,10 @@ int main(int argc, char **argv)
```
will print the SVG output to `stdout` (the file `"mem.svg"` is not created).
This is particularly useful for the textual formats EPS and SVG,[^9] allowing
This is particularly useful for the textual formats EPS and SVG,[^10] allowing
the output to be manipulated and processed by the client.
[^9]: BARCODE_MEMORY_FILE textual formats EPS and SVG will have Unix newlines
[^10]: BARCODE_MEMORY_FILE textual formats EPS and SVG will have Unix newlines
(LF) on both Windows and Unix, i.e. not CR+LF on Windows.
## 5.7 Setting Options
@@ -1952,7 +1956,7 @@ Member Name Type Meaning Default Value
`height` float Symbol height in Symbol dependent
X-dimensions, excluding
fixed width-to-height
symbols.[^10]
symbols.[^11]
`scale` float Scale factor for 1.0
adjusting size of image
@@ -2002,7 +2006,7 @@ Member Name Type Meaning Default Value
`.eps`, `.pcx`, `.svg`,
`.tif` or `.txt` followed
by a terminating
`NUL`.[^11]
`NUL`.[^12]
`primary` character Primary message data for `""` (empty)
string more complex symbols,
@@ -2129,13 +2133,13 @@ Member Name Type Meaning Default Value
Table: API Structure `zint_symbol` {#tbl:api_structure_zint_symbol}
[^10]: The `height` value is ignored for Aztec (including HIBC and Aztec Rune),
[^11]: The `height` value is ignored for Aztec (including HIBC and Aztec Rune),
Code One, Data Matrix (including HIBC), DotCode, Grid Matrix, Han Xin, MaxiCode,
QR Code (including HIBC, Micro QR, rMQR and UPNQR), and Ultracode - all of which
have a fixed width-to-height ratio (or, in the case of Code One, a fixed
height).
[^11]: For Windows, `outfile` is assumed to be UTF-8 encoded.
[^12]: For Windows, `outfile` is assumed to be UTF-8 encoded.
To alter these values use the syntax shown in the example below. This code has
the same result as the previous example except the output is now taller and
@@ -2301,10 +2305,10 @@ Value Effect
------------------------- ---------------------------------------------------
0 No options selected.
`BARCODE_BIND_TOP` Boundary bar above the symbol only.[^12]
`BARCODE_BIND_TOP` Boundary bar above the symbol only.[^13]
`BARCODE_BIND` Boundary bars above and below the symbol and
between rows if stacking multiple symbols.[^13]
between rows if stacking multiple symbols.[^14]
`BARCODE_BOX` Add a box surrounding the symbol and whitespace.
@@ -2331,7 +2335,7 @@ Value Effect
Symbols in Memory (raster)].
`BARCODE_QUIET_ZONES` Add compliant quiet zones (additional to any
specified whitespace).[^14]
specified whitespace).[^15]
`BARCODE_NO_QUIET_ZONES` Disable quiet zones, notably those with defaults.
@@ -2353,13 +2357,13 @@ Value Effect
Table: API `output_options` Values {#tbl:api_output_options}
[^12]: The `BARCODE_BIND_TOP` flag is set by default for DPD - see [6.1.10.7 DPD
[^13]: The `BARCODE_BIND_TOP` flag is set by default for DPD - see [6.1.10.7 DPD
Code].
[^13]: The `BARCODE_BIND` flag is always set for Codablock-F, Code 16K and Code
[^14]: The `BARCODE_BIND` flag is always set for Codablock-F, Code 16K and Code
49. Special considerations apply to ITF-14 - see [6.1.2.6 ITF-14].
[^14]: Codablock-F, Code 16K, Code 49, EAN-13, EAN-8, EAN/UPC add-ons, ISBN,
[^15]: Codablock-F, Code 16K, Code 49, EAN-13, EAN-8, EAN/UPC add-ons, ISBN,
ITF-14, UPC-A and UPC-E have compliant quiet zones added by default.
## 5.11 Setting the Input Mode
@@ -2625,7 +2629,7 @@ Value Meaning
`ZINT_CAP_STACKABLE` Is the symbology stackable? Note that stacked
symbologies are not stackable.
`ZINT_CAP_EANUPC`[^15] Is the symbology EAN/UPC?
`ZINT_CAP_EANUPC`[^16] Is the symbology EAN/UPC?
`ZINT_CAP_COMPOSITE` Does the symbology support composite data? (see
[6.3 GS1 Composite Symbols (ISO 24723)] below)
@@ -2661,7 +2665,7 @@ Value Meaning
Table: API Capability Flags {#tbl:api_cap}
[^15]: `ZINT_CAP_EANUPC` was previously named `ZINT_CAP_EXTENDABLE`, which is
[^16]: `ZINT_CAP_EANUPC` was previously named `ZINT_CAP_EXTENDABLE`, which is
still recognised.
For example:
@@ -2688,7 +2692,7 @@ On successful encodation (after using `ZBarcode_Encode()` etc.) the `option_1`,
create the barcode. This is useful for feedback if the values were left as
defaults or were overridden by Zint.
In particular for symbologies that have masks,[^16] `option_3` will contain the
In particular for symbologies that have masks,[^17] `option_3` will contain the
mask used as `(N + 1) << 8`, N being the mask. Also Aztec Code will return the
actual ECC percentage used in `option_1` as `P << 8`, where P is the integer
percentage, the low byte containing the values given in [#tbl:aztec_eccs] (with
@@ -2705,7 +2709,7 @@ being set in `raw_seg_count` - which will always be at least one.
The `source`, `length` and `eci` members of `zint_seg` will be set accordingly -
the unconverted data in `source`, the data length in `length`, and the character
set the data was converted to in `eci`. Any check characters encoded will be
included,[^17] and for GS1 data any `FNC1` separators will be represented as
included,[^18] and for GS1 data any `FNC1` separators will be represented as
`GS` (ASCII 29) characters. UPC-A and UPC-E data will be expanded to EAN-13, as
will EAN-8 but only if it has an add-on (otherwise it will remain at 8 digits),
and any add-ons will follow the 13 digits directly (no separator). GS1 Composite
@@ -2717,16 +2721,16 @@ is `DATA_MODE`, it remains in binary; otherwise it will be in UTF-8. The UTF-8
source may be converted to the character set of the corresponding `eci` member
using the two helper functions discussed next.
[^16]: DotCode, Han Xin, Micro QR Code, QR Code and UPNQR have variable masks.
[^17]: DotCode, Han Xin, Micro QR Code, QR Code and UPNQR have variable masks.
Rectangular Micro QR Code has a fixed mask (4).
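
The `(N + 1) << 8` and `P << 8` packings described above can be unpacked with plain shifts and masks. The following is a minimal sketch only: the helper names are hypothetical (they are not Zint functions), and it assumes the mask field occupies a single byte above the low byte, which the text does not state explicitly.

```c
#include <assert.h>

/* Hypothetical helper: recover the mask N from a returned `option_3`,
   given that it is stored as (N + 1) << 8.
   Returns -1 if no mask was reported (field is zero).
   Assumption: the mask field fits in the second-lowest byte. */
int mask_from_option_3(int option_3)
{
    int n_plus_1;

    assert(option_3 >= 0);
    n_plus_1 = (option_3 >> 8) & 0xFF;
    return n_plus_1 - 1;
}

/* Hypothetical helper: recover the Aztec ECC percentage P from a returned
   `option_1`, given that it is stored as P << 8. */
int aztec_ecc_percent(int option_1)
{
    assert(option_1 >= 0);
    return option_1 >> 8;
}

/* Hypothetical helper: the low byte of `option_1`, which holds the ECC
   value from the Aztec ECC table. */
int aztec_ecc_value(int option_1)
{
    assert(option_1 >= 0);
    return option_1 & 0xFF;
}
```

For instance, an `option_3` of `(4 + 1) << 8` yields mask 4, the fixed mask noted above for Rectangular Micro QR Code.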
[^17]: Except for Japanese Postal Code, whose check character is not truly
[^18]: Except for Japanese Postal Code, whose check character is not truly
representable in the encoded data.
## 5.17 UTF-8 to ECI convenience functions
As a convenience the conversion done by Zint from UTF-8 to ECIs is exposed in
two helper functions (compatible with the `libzueci`[^18] functions
two helper functions (compatible with the `libzueci`[^19] functions
`zueci_utf8_to_eci()` and `zueci_dest_len_eci()`):
@@ -2746,7 +2750,7 @@ returned in `p_dest_length`, may be smaller than the estimate given by
NUL-terminated. The destination buffer is not NUL-terminated. The obsolete ECIs
0, 1 and 2 are supported.
[^18]: The library `libzueci`, which can convert both to and from UTF-8 and ECI,
[^19]: The library `libzueci`, which can convert both to and from UTF-8 and ECI,
is available at [https://sourceforge.net/projects/libzueci/](
https://sourceforge.net/projects/libzueci/).
@@ -3351,7 +3355,7 @@ alphanumerics) are not recommended.
#### 6.1.10.2 Code 128 Suppress Code Set C (Code Sets A and B only)
It is sometimes advantageous to stop Code 128 from using Code Set C which
compresses numerical data. The `BARCODE_CODE128AB`[^19] variant (symbology 60)
compresses numerical data. The `BARCODE_CODE128AB`[^20] variant (symbology 60)
suppresses Code Set C in favour of Code Sets A and B.
![`zint -b CODE128AB -d "130170X178"`](images/code128ab.svg){.lin}
@@ -3359,7 +3363,7 @@ suppresses Code Set C in favour of Code Sets A and B.
Note that the special extra escapes mentioned above are not available for this
variant (nor for any other).
[^19]: `BARCODE_CODE128AB` previously used the name `BARCODE_CODE128B`, which is
[^20]: `BARCODE_CODE128AB` previously used the name `BARCODE_CODE128B`, which is
still recognised.
#### 6.1.10.3 GS1-128
@@ -3965,7 +3969,7 @@ first and last digit are ignored, leaving a 4-digit DX Extract number in any
case, which must be in the range 16 to 2047. The second format `"NNN-NN"`
represents the DX Extract as two numbers separated by a dash (`-`), the first
number being 1 to 3 digits (range 1 to 127) and the second 1 to 2 digits (range
0 to 15).[^20]
0 to 15).[^21]
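
The two formats are related arithmetically. A minimal sketch of converting the `"NNN-NN"` form to the DX Extract number follows, assuming the extract equals 16 times the first number plus the second; that assumption is consistent with the stated ranges (1 × 16 + 0 = 16 and 127 × 16 + 15 = 2047), but the function name is hypothetical and this is not Zint's own parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: parse "NNN-NN" (1 to 3 digits, dash, 1 to 2 digits)
   and return the DX Extract number (16 to 2047), or -1 if the input is
   malformed or out of range.
   Assumption: extract = 16 * first + second. */
int dx_extract_from_parts(const char *s)
{
    int first = 0, second = 0, n1 = 0, n2 = 0;

    assert(s != NULL);
    /* First number: up to 3 digits */
    while (isdigit((unsigned char) *s) && n1 < 3) {
        first = first * 10 + (*s++ - '0');
        n1++;
    }
    if (n1 == 0 || *s++ != '-') {
        return -1;
    }
    /* Second number: up to 2 digits */
    while (isdigit((unsigned char) *s) && n2 < 2) {
        second = second * 10 + (*s++ - '0');
        n2++;
    }
    if (n2 == 0 || *s != '\0') {
        return -1;
    }
    /* Range checks as given in the text: 1 to 127 and 0 to 15 */
    if (first < 1 || first > 127 || second > 15) {
        return -1;
    }
    return first * 16 + second;
}
```

Under this assumption `"127-15"` maps to 2047 and `"1-0"` to 16, the two ends of the permitted range.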
The optional frame number is a number in the range 0 to 63, and may have a half
frame indicator `"A"` appended. Special character sequences (with or without a
@@ -3975,7 +3979,7 @@ number 62, `"K"` or `"00"` means frame number 63, and `"F"` means frame number
A parity bit is automatically added by Zint.
[^20]: The DX number may be looked up in The (Modified) Big Film Database at
[^21]: The DX number may be looked up in The (Modified) Big Film Database at
[https://thebigfilmdatabase.merinorus.com](
https://thebigfilmdatabase.merinorus.com).