Huumus: Organic Horizon Parser
What is the Huumus field?
The Huumus (huumus) field describes the organic surface horizon of a soil
profile: its type, degree of decomposition, and thickness. Each polygon in the soil
map may contain up to four soil units (separated by spaces), and each unit may
carry readings for upper and lower depth layers (separated by /).
A simple example:
th15/h5 t₂20
This encodes two soil units:
| Unit | Notation | Meaning |
|---|---|---|
| 1 (upper) | `th15` | Toorhuumus (raw humus) in the upper layer, 15 cm thick |
| 1 (lower) | `h5` | Mineral humus horizon in the lower layer, 5 cm thick |
| 2 | `t₂20` | Moderately decomposed peat, 20 cm thick |
The notation system
Organic horizon types
| Notation | Type | Estonian name | Description |
|---|---|---|---|
| `th[depth]` | Toorhuumus | Toorhuumus | Raw/mor humus. Poorly decomposed, strongly acidic. Always > 10 cm. Characteristic of conifer forests on sandy or peaty soils. |
| `t[degree][depth]` | Turvas | Turvas | Peat. Decomposition degree 1 (weakly), 2 (moderately), or 3 (strongly). Always > 10 cm. |
| `h[depth]` | Huumus | Huumus | Mineral humus horizon (mull). Rare in the dataset. |
| `[depth][degree]` | Kõdu | Metsakõdu | Forest litter. Decomposition degree 1–3, written as a subscript after the depth (for example 2₁). Thickness ≤ 10 cm. Typical of deciduous forests. |
| `0` | None | Puudub | No organic surface layer. On the right side of / (lower layer), 0 means humus with zero thickness (see the worked examples). |
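As a rough illustration of the table above, the token types can be told apart with a handful of regular expressions. This is only a sketch, not the parser's actual code: `classify_token` and `TOKEN_PATTERNS` are hypothetical names, the labels reuse the `h_type` values documented further down, and treating a bare depth such as `18-22` as humus is inferred from the worked examples rather than from the table.

```python
import re

SUBSCRIPTS = "₁₂₃"
DEPTH = r"\d+(?:-\d+)?"  # single value or range, e.g. 15 or 15-25

# Order matters: "0" must be tested before the bare-depth pattern.
TOKEN_PATTERNS = [
    ("none", re.compile(r"^0$")),
    ("th", re.compile(rf"^th{DEPTH}$")),
    ("peat", re.compile(rf"^t[{SUBSCRIPTS}]{DEPTH}$")),
    ("humus", re.compile(rf"^h{DEPTH}$")),
    ("litter", re.compile(rf"^{DEPTH}[{SUBSCRIPTS}]$")),
    # Assumption: a bare depth is humus, as in the worked examples ("18-22").
    ("humus", re.compile(rf"^{DEPTH}$")),
]


def classify_token(token):
    """Return a type label for one token, or "unknown" if nothing matches.

    Position relative to "/" is ignored here, so the special meaning of a
    right-side 0 is not handled by this helper.
    """
    for label, pattern in TOKEN_PATTERNS:
        if pattern.match(token):
            return label
    return "unknown"
```

Keeping the patterns in an ordered list makes the precedence explicit, which matters because `0` would otherwise also match the bare-depth rule.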
Decomposition degree
For peat (turvas) and forest litter (metsakõdu), the degree of organic matter decomposition is encoded as a subscript digit:
| Subscript | ASCII | Degree | Description |
|---|---|---|---|
| ₁ | 1 | I (weakly decomposed) | Fibric — plant structures still recognisable |
| ₂ | 2 | II (moderately decomposed) | Hemic — partially broken down |
| ₃ | 3 | III (strongly decomposed) | Sapric — highly humified, amorphous |
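The subscript-to-ASCII mapping in this table is small enough to express as a lookup. The sketch below assumes a hypothetical helper name, `degree_from_subscript`; it simply mirrors the table and is not taken from the parser source.

```python
from typing import Optional

# Mirrors the table: Unicode subscript digit -> decomposition degree.
SUBSCRIPT_TO_DEGREE = {"₁": 1, "₂": 2, "₃": 3}


def degree_from_subscript(ch: str) -> Optional[int]:
    """Return the degree (1-3) for a subscript digit, or None otherwise."""
    return SUBSCRIPT_TO_DEGREE.get(ch)
```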
Depth notation
Depth is recorded in centimetres as a single value or a range:
| Notation | Meaning |
|---|---|
| `th15` | Toorhuumus, 15 cm thick |
| `th15-25` | Toorhuumus, 15 to 25 cm thick |
| `2₁` | Forest litter, degree 1, 2 cm thick |
Depth represents the thickness of the horizon, not depth below the surface.
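A single value and a range can both be reduced to a (min, max) pair, which matches how the output fields report a single value (min equals max). The following is a minimal sketch; `parse_depth` is a hypothetical name and the regular expression only covers the two forms shown above.

```python
import re

# One integer, optionally followed by "-" and a second integer.
DEPTH_RE = re.compile(r"^(\d+)(?:-(\d+))?$")


def parse_depth(text):
    """Return (min_cm, max_cm) as floats, or None if text is not a depth."""
    match = DEPTH_RE.match(text)
    if match is None:
        return None
    low = float(match.group(1))
    # A single value means the minimum and maximum thickness coincide.
    high = float(match.group(2)) if match.group(2) else low
    return low, high
```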
/: upper / lower depth layers
A / within a soil unit separates the upper depth layer (left) from the
lower depth layer (right):
th15/h5
Left of /: toorhuumus 15 cm (upper layer in the soil profile)
Right of /: mineral humus 5 cm (lower layer below)
When only one value is given (no /), it applies to the whole organic horizon.
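The split described here can be sketched in a few lines. `split_depth_layers` is a hypothetical helper; returning the whole expression as the lower part when no / is present follows the rule, stated in the output-field section, that the primary humus comes from the right side or from the whole expression.

```python
def split_depth_layers(unit):
    """Return (upper, lower) for one soil unit.

    upper is None when the unit has no "/" and the single value
    describes the whole organic horizon.
    """
    if "/" in unit:
        upper, lower = unit.split("/", 1)
        return upper, lower
    return None, unit
```

For `th15/h5` this yields `("th15", "h5")`, and for `th15` it yields `(None, "th15")`.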
` / `: agricultural-vs-forest convention (legacy)
A slash surrounded by spaces (` / `) is the legacy paper-map convention
distinguishing agricultural from forest land within the same polygon:
25-28 / th20-30
The parser detects this pattern during preprocessing (before space-splitting)
and flags the resulting unit with h_is_agri_forest = True. Inside the unit,
the rejoined expression 25-28/th20-30 is then processed via the normal /
depth-layer logic with the flag set. Values with h_depth_total > 40 cm are
also flagged for manual review, since they may indicate legacy agri/forest uses of
the ` / ` convention (per Alar's guidance).
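One way to picture the preprocessing step is to rejoin the space-slash-space pattern before splitting on whitespace, while remembering which units were affected. This is a sketch under assumptions: `split_units` is a hypothetical name, and the NUL character is used as a temporary marker on the assumption that it never occurs in raw map values.

```python
import re

AGRI_FOREST_RE = re.compile(r"\s+/\s+")
MARKER = "\x00"  # assumed never to appear in raw values


def split_units(raw):
    """Split a raw Huumus string into (unit_text, is_agri_forest) pairs."""
    # Replace " / " with a marker so the whitespace split keeps it together.
    joined = AGRI_FOREST_RE.sub(MARKER, raw.strip())
    units = []
    for token in joined.split():
        is_agri_forest = MARKER in token
        # Restore a plain "/" so normal depth-layer logic can run afterwards.
        units.append((token.replace(MARKER, "/"), is_agri_forest))
    return units
```

With the input `25-28 / th20-30` this returns a single unit `25-28/th20-30` flagged as agri/forest, whereas `th15/h5 t₂20` returns two unflagged units.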
Multiple layers within a unit
Within one depth layer, individual organic sub-layers are joined by +:
5₁+3₂
This means: litter layer 1 (5 cm, degree 1) stacked above litter layer 2 (3 cm, degree 2). In the output columns, litter layers are ordered bottom-to-top: O1 is the lowest/most decomposed layer, O3 the highest/freshest layer.
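The reordering from written order (top first) to output order (bottom first) amounts to a split and a reversal. The helper name `litter_layers_bottom_to_top` below is hypothetical, and the sketch assumes that every `+`-joined part is a litter layer.

```python
def litter_layers_bottom_to_top(layer_text):
    """Split on "+" and reverse, so index 0 corresponds to O1 (lowest)."""
    parts = [part for part in layer_text.split("+") if part]
    return list(reversed(parts))
```

For `5₁+3₂` the result is `["3₂", "5₁"]`: the 3 cm degree-2 layer becomes O1 and the 5 cm degree-1 layer becomes O2.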
Brackets
Parentheses ( ) around a unit or sub-formula mark a locally variable or uncertain
layer. The brackets are stripped before parsing.
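Stripping the brackets is a plain character removal. The sketch below uses a hypothetical helper, `strip_brackets`; returning a flag that records whether brackets were present is an illustrative addition, since the text above only states that they are stripped.

```python
def strip_brackets(text):
    """Return (text_without_parentheses, had_brackets)."""
    had_brackets = "(" in text or ")" in text
    cleaned = text.replace("(", "").replace(")", "")
    return cleaned, had_brackets
```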
Output fields
The parser returns ~96 columns per soil polygon row. Fields indexed 1–4 correspond to the up to four soil units present in the polygon.
Quality and provenance
| Field | Type | Description |
|---|---|---|
| `n_siffers` | int | Number of space-separated soil units in this polygon (0–4) |
| `parse_ok_h` | bool | True if every token in all units was recognised |
| `parse_error` | str | Semicolon-separated list of unrecognised tokens; empty on success |
Backward-compatible per-unit fields (×4)
Preserved from the previous parser version. These pool all depth layers together to give a single dominant type and depth envelope per unit.
| Field | Type | Description |
|---|---|---|
| `h_raw_n` | str | Verbatim input string for this unit, exactly as it appears in the map |
| `h_has_split_n` | bool | True if this unit contains a / split |
| `h_type_n` | str | Dominant organic layer type across all depth layers (see table below) |
| `h_depth_min_n` | float | Shallowest horizon boundary in cm (NaN if no depth recorded) |
| `h_depth_max_n` | float | Deepest horizon boundary in cm (equals h_depth_min for single-value depths) |
Values for h_type:
| Value | Meaning |
|---|---|
| `"th"` | Toorhuumus (raw/mor humus) |
| `"peat"` | Turvas (peat, any decomposition degree) |
| `"humus"` | Mineral humus horizon (mull) |
| `"litter"` | Metsakõdu (forest litter) |
| `"none"` | No organic layer (0) |
| `"mixed"` | More than one type present in this unit (priority: th > peat > humus > litter > none) |
| `"unknown"` | Unrecognised token; check parse_error |
| `""` | Unit slot not populated (the row has fewer than n units) |
Primary humus (phu): 3 fields × 4 units
Taken from the right side of / (lower depth layer), or from the whole expression when no / is present.
| Field | Type | Description |
|---|---|---|
| `h_phu_type_n` | str | Primary humus type: "th", "peat", "humus", "none", "unknown", "" |
| `h_phu_min_n` | float | Minimum depth in cm for primary humus |
| `h_phu_max_n` | float | Maximum depth in cm for primary humus |
Secondary humus (lhu): 3 fields × 4 units
Taken from the left side of / (upper depth layer), if a second humus type is present there.
| Field | Type | Description |
|---|---|---|
| `h_lhu_type_n` | str | Secondary humus type (same values as phu); "" when absent |
| `h_lhu_min_n` | float | Minimum depth in cm for secondary humus |
| `h_lhu_max_n` | float | Maximum depth in cm for secondary humus |
Litter layers (O1–O3): 9 fields × 4 units
Taken from the left side of / in bottom-to-top order: O1 = lowest litter layer
(most decomposed, deepest in profile), O3 = highest litter layer (freshest,
shallowest).
| Field | Type | Description |
|---|---|---|
| `h_o1_deg_n` | str | Decomposition degree subscript for O1 (₁, ₂, ₃), or "" when absent |
| `h_o1_min_n` | float | Minimum depth in cm for O1 |
| `h_o1_max_n` | float | Maximum depth in cm for O1 |
| `h_o2_deg_n` | str | Decomposition degree for O2 |
| `h_o2_min_n` | float | Minimum depth in cm for O2 |
| `h_o2_max_n` | float | Maximum depth in cm for O2 |
| `h_o3_deg_n` | str | Decomposition degree for O3 |
| `h_o3_min_n` | float | Minimum depth in cm for O3 |
| `h_o3_max_n` | float | Maximum depth in cm for O3 |
Per-unit flags: 3 fields × 4 units
| Field | Type | Description |
|---|---|---|
| `h_has_depth_split_n` | bool | True when / is present (depth-layer split between upper and lower horizons) |
| `h_is_agri_forest_n` | bool | True when the ` / ` (space-slash-space) legacy agri-vs-forest pattern is detected |
| `h_depth_total_n` | float | Sum of phu + lhu depth range midpoints in cm. Values > 40 cm are flagged for manual agri/forest review. |
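The `h_depth_total` arithmetic can be illustrated as follows. This is a sketch only: `depth_total` and `needs_manual_review` are hypothetical names, and treating a missing lhu as contributing nothing to the sum is an assumption.

```python
REVIEW_THRESHOLD_CM = 40.0  # threshold quoted in the field description


def midpoint(low, high):
    """Midpoint of a depth range in cm."""
    return (low + high) / 2.0


def depth_total(phu, lhu=None):
    """Sum the range midpoints of phu and, when present, lhu.

    phu and lhu are (min_cm, max_cm) tuples; lhu may be None.
    """
    total = midpoint(*phu)
    if lhu is not None:
        total += midpoint(*lhu)
    return total


def needs_manual_review(total_cm):
    """True when the total exceeds the 40 cm review threshold."""
    return total_cm > REVIEW_THRESHOLD_CM
```

For `25-28 / th20-30`, the phu midpoint is 25 cm and the lhu midpoint is 26.5 cm, giving 51.5 cm, which is above the review threshold.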
Structured detail
| Field | Type | Description |
|---|---|---|
| `huumus_json` | dict | Full structured parse result: per-unit, per-layer detail including decomposition degree, depth (min/max), depth-layer split, agri-forest flag, and structured phu/lhu/O1-O3 data. Stored in DB but excluded from map viewer popup. |
The huumus_json field contains the complete picture and is useful for
re-inspection, custom aggregation, or downstream modelling where the simple
summary fields are not sufficient.
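The field groups listed in this section add up to the roughly 96 columns mentioned at the start of it: 3 row-level fields, 23 per-unit fields for each of 4 units, and `huumus_json`. The sketch below enumerates the names to check that arithmetic; `expected_columns` is a hypothetical helper and the column order shown is an assumption.

```python
PER_ROW = ["n_siffers", "parse_ok_h", "parse_error"]

# Field stems that are repeated for each unit with a _1.._4 suffix.
PER_UNIT = [
    "h_raw", "h_has_split", "h_type", "h_depth_min", "h_depth_max",
    "h_phu_type", "h_phu_min", "h_phu_max",
    "h_lhu_type", "h_lhu_min", "h_lhu_max",
    *[f"h_o{i}_{part}" for i in (1, 2, 3) for part in ("deg", "min", "max")],
    "h_has_depth_split", "h_is_agri_forest", "h_depth_total",
]


def expected_columns(max_units=4):
    """Return the documented column names (order is illustrative)."""
    columns = list(PER_ROW)
    for n in range(1, max_units + 1):
        columns += [f"{stem}_{n}" for stem in PER_UNIT]
    columns.append("huumus_json")
    return columns
```

A list like this can be compared against the columns of a parsed table to spot missing or unexpected fields.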
Parse coverage
The parser resolves approximately 97 % of unique raw values in the dataset
(tested on ~31 800 unique values). Unresolved entries are collected in an
error-lookup table (huumus_error_lookup) which domain scientists populate with
corrections after each processing run.
Worked examples
Example 1 — simple toorhuumus:
Raw: "th15"
n_siffers=1, parse_ok=True
h_raw_1="th15", h_type_1="th"
h_depth_min_1=15.0, h_depth_max_1=15.0
h_has_split_1=False
h_phu_type_1="th", h_phu_min_1=15.0, h_phu_max_1=15.0
Example 2 — depth layers + peat:
Raw: "th15/h5 t₂20"
n_siffers=2, parse_ok=True
Unit 1: h_raw_1="th15/h5", h_has_split_1=True
phu (right/lower): h_phu_type_1="humus", h_phu_min_1=5.0, h_phu_max_1=5.0
lhu (left/upper): h_lhu_type_1="th", h_lhu_min_1=15.0, h_lhu_max_1=15.0
Unit 2: h_raw_2="t₂20", h_has_split_2=False
phu: h_phu_type_2="peat", h_phu_min_2=20.0, h_phu_max_2=20.0
Example 3 — stacked litter layers over humus:
Raw: "2₁+1₂/(5-10)"
n_siffers=1, parse_ok=True
h_raw_1="2₁+1₂/(5-10)", h_has_depth_split_1=True
phu (right/lower): h_phu_type_1="humus", h_phu_min_1=5.0, h_phu_max_1=10.0
O1 (bottom litter): h_o1_deg_1="₂", h_o1_min_1=1.0
O2 (top litter): h_o2_deg_1="₁", h_o2_min_1=2.0
Litter layer ordering
Litter layers are stored bottom-to-top: O1 is the deepest/most decomposed
layer, O3 is the shallowest/freshest. In 2₁+1₂, the 1 cm degree-2 layer
sits below the 2 cm degree-1 layer, so O1 = degree ₂ (1 cm) and
O2 = degree ₁ (2 cm).
Example 4 — litter over zero humus:
Raw: "2₁+2₂+4₃/0"
n_siffers=1, parse_ok=True
h_raw_1="2₁+2₂+4₃/0", h_has_depth_split_1=True
phu (right): h_phu_type_1="humus", h_phu_min_1=0.0, h_phu_max_1=0.0
O1 (bottom): h_o1_deg_1="₃", h_o1_min_1=4.0
O2 (middle): h_o2_deg_1="₂", h_o2_min_1=2.0
O3 (top): h_o3_deg_1="₁", h_o3_min_1=2.0
Right-side 0
When 0 appears on the right side of / (lower layer), it is classified as
humus with zero thickness (h_phu_type_1="humus", depth 0/0 cm) rather than
"none". This distinguishes the lower humus horizon from truly absent
organic matter.
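The position-dependent meaning of 0 can be captured with one small branch. The function name `classify_zero` is hypothetical, and returning no depth for the "none" case is an assumption, since the note above only specifies the right-side behaviour.

```python
def classify_zero(on_right_of_slash):
    """Classify a bare "0" token by its position relative to "/".

    Returns (type_label, depth_range), where depth_range is a
    (min_cm, max_cm) tuple or None when no depth applies.
    """
    if on_right_of_slash:
        # Lower layer: humus horizon recorded with zero thickness.
        return "humus", (0.0, 0.0)
    # Elsewhere: no organic surface layer at all.
    return "none", None
```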
Example 5 — legacy agri/forest split:
Raw: "25-28 / th20-30"
n_siffers=1, parse_ok=True
h_raw_1="25-28 / th20-30", h_is_agri_forest_1=True, h_has_depth_split_1=True
phu (right): h_phu_type_1="th", h_phu_min_1=20.0, h_phu_max_1=30.0
lhu (left): h_lhu_type_1="humus", h_lhu_min_1=25.0, h_lhu_max_1=28.0
Agri/forest flag
The / (space-slash-space) pattern sets h_is_agri_forest = True and
h_depth_total midpoints are computed. Values > 40 cm are flagged for
manual review — they may indicate the original paper-map convention where
/ separated agricultural from forest land readings within the same polygon
(per Alar's analysis of the historical mapping practice).
Example 6 — four simple humus ranges:
Raw: "18-22 20-25 20-25 20-27"
n_siffers=4, parse_ok=True
Unit 1: h_phu_type_1="humus", h_phu_min_1=18.0, h_phu_max_1=22.0
Unit 2: h_phu_type_2="humus", h_phu_min_2=20.0, h_phu_max_2=25.0
Unit 3: h_phu_type_3="humus", h_phu_min_3=20.0, h_phu_max_3=25.0
Unit 4: h_phu_type_4="humus", h_phu_min_4=20.0, h_phu_max_4=27.0