Heritage Data Processor CLI: create Command Guide¶
Overview¶
The create command is the primary CLI tool for initializing new Heritage Data Processor (HDP) projects. It performs a complete, atomic project setup in a single operation, creating the .hdpc database file, scanning source files, validating them, and preparing them for further processing.
This guide focuses on 3D model workflows, which are the most common use case for heritage digitization projects.
Table of Contents¶
- Basic Concepts
- Command Syntax
- Required Arguments
- Batch Entity Modes
- Bundling Strategies
- 3D Model-Specific Options
- Practical Examples
- Troubleshooting
- Advanced Tips
- Summary
Basic Concepts¶
What is a .hdpc Project?¶
An .hdpc file is a SQLite database that contains:
- Project metadata: Name, short code, timestamps
- File inventory: Scanned source files with hierarchical relationships
- Configuration: Paths, modality templates, scan options
- Metadata mappings: Field mappings for publication workflows
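Because a .hdpc file is a SQLite database, it can be inspected with any SQLite client. The following is a minimal sketch using Python's standard sqlite3 module; it only lists whatever tables exist and makes no assumption about the actual HDP schema. The helper name list_tables is hypothetical and not part of the HDP CLI.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(hdpc_path: str) -> list[str]:
    """Return the table names stored in a SQLite file such as a .hdpc project."""
    # Open read-only through a URI so that inspection cannot modify the project.
    uri = f"{Path(hdpc_path).resolve().as_uri()}?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Calling list_tables("./projects/individual_artifacts.hdpc") after a successful create run would show which tables the project actually contains.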
Modality Templates¶
Modality templates are predefined configurations that specify:
- Which file extensions to scan initially
- Default scan behaviors for that data type
- Common validation rules
For 3D models, the default modality is "3D Model", which includes extensions like .obj, .mtl, .glb, .gltf, .fbx, .ply, and .stl.
Batch Entity Modes¶
The batch entity mode determines how files are grouped into records (publishable units):
| Mode | Behavior | Use Case |
|---|---|---|
| root | Each file in the root directory becomes a separate record | Individual artifacts with no subfolder organization |
| subdirectory | Each subdirectory becomes one record, containing all its files | Collections organized by artifact/site in folders |
| hybrid | Combines both: root files as separate records + subdirectories as grouped records | Mixed datasets with both standalone files and organized collections |
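The grouping rules in the table can be sketched in a few lines of Python. This is an illustration of the rules as described here, not the HDP implementation: the function name group_records is hypothetical, and the sketch ignores bundling and MTL attachment, so a root-level .mtl file counts as its own record.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_records(relative_paths, mode):
    """Group paths (relative to the input directory) into records.

    Returns a dict mapping a record name to the sorted list of paths in it.
    """
    if mode not in {"root", "subdirectory", "hybrid"}:
        raise ValueError(f"unknown batch entity mode: {mode}")

    records = defaultdict(list)
    for raw in relative_paths:
        path = PurePosixPath(raw)
        if len(path.parts) == 1:
            # File directly in the input directory: one record per file.
            if mode in {"root", "hybrid"}:
                records[path.name].append(raw)
        elif mode in {"subdirectory", "hybrid"}:
            # File inside a subdirectory: the top-level folder names the record.
            records[path.parts[0]].append(raw)
    return {name: sorted(files) for name, files in records.items()}
```

Files that a mode does not cover (for example, subdirectory contents in root mode) are simply skipped in this sketch.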
Command Syntax¶
python main.py create \
--hdpc-path <path_to_hdpc_file> \
--project-name <descriptive_name> \
--short-code <unique_code> \
--input-dir <source_directory> \
--output-dir <output_directory> \
[OPTIONS]
Required Arguments¶
--hdpc-path¶
Path where the .hdpc project file will be created.
- Must end with the .hdpc extension
- Parent directory must exist
- File will be created by the command (must not already exist)
Example:
--hdpc-path ./projects/individual_artifacts.hdpc
--project-name¶
Descriptive human-readable name for the project.
- Can contain spaces and special characters
- Will be used in reports and publication metadata
- Should be meaningful and descriptive
Example:
--project-name "Individual Artifacts"
--short-code¶
Short unique identifier for the project.
- Typically alphanumeric, no spaces
- Used for internal references and file naming
- Should be concise (e.g., institution code + year)
Example:
--short-code "INDART2025"
--input-dir¶
Directory containing the source data files to scan.
- Must be an existing directory
- All files matching the modality extensions will be scanned
- Subdirectories are scanned based on the --batch-entity mode
Example:
--input-dir ./test_data/root_mode_examples/no_bundling
--output-dir¶
Directory where processed outputs will be stored.
- Will be created if it doesn't exist
- Used for Zenodo uploads, derivatives, and exports
- Should be separate from input directory
Example:
--output-dir ./output/individual
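The path rules listed for --hdpc-path and --input-dir can be checked before running create. The sketch below is a hypothetical pre-flight helper, not part of the HDP CLI; it only encodes the constraints stated in this section.

```python
from pathlib import Path


def preflight_problems(hdpc_path: str, input_dir: str) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    hdpc = Path(hdpc_path)
    if hdpc.suffix != ".hdpc":
        problems.append(f"{hdpc} does not end with .hdpc")
    if not hdpc.parent.is_dir():
        problems.append(f"parent directory {hdpc.parent} does not exist")
    if hdpc.exists():
        problems.append(f"{hdpc} already exists")
    if not Path(input_dir).is_dir():
        problems.append(f"input directory {input_dir} does not exist")
    return problems
```

The output directory is not checked because, as stated above, it is created if it does not exist.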
Batch Entity Modes¶
Mode 1: root (Default)¶
Behavior: Each file in the root level of --input-dir becomes a separate record.
When to use:
- Individual artifact scans stored as separate files
- No subfolder organization
- Each 3D model represents a distinct publishable item
Example directory structure:
test_data/root_mode_examples/no_bundling/
├── artifact_photo.png → Record 1
├── building_scan.glb → Record 2
├── statue_model.obj → Record 3
└── terrain_data.fbx → Record 4
Command:
python main.py create \
--hdpc-path ./projects/individual_artifacts.hdpc \
--project-name "Individual Artifacts" \
--short-code "INDART2025" \
--input-dir ./test_data/root_mode_examples/no_bundling \
--output-dir ./output/individual \
--batch-entity root
Result: 4 separate records, one per file.
Mode 2: subdirectory¶
Behavior: Each subdirectory in --input-dir becomes one record, containing all files within it.
When to use:
- Files are pre-organized into folders by artifact/site
- Each folder represents one publishable unit
- Multiple related files (OBJ + MTL + textures) belong together
Example directory structure:
test_data/subdirectory_mode_examples/
├── archaelogical_site_001/ → Record 1
│   ├── excavation_photo_001.png
│   ├── excavation_photo_002.png
│   ├── site_overview.obj
│   ├── site_overview.mtl
│   └── textures/
│       ├── stone_texture.jpg
│       └── wood_normal.jpg
├── archaelogical_site_002/ → Record 2
│   ├── artifact_scan.glb
│   ├── context_photo.png
│   └── documentation.pdf
└── museum_collection_item_045/ → Record 3
    ├── detail_scan_001.obj
    ├── detail_scan_001.mtl
    ├── main_model.fbx
    └── reference_images/
        ├── front_view.jpg
        └── side_view.jpg
Command:
python main.py create \
--hdpc-path ./projects/site_collections.hdpc \
--project-name "Archaeological Site Collections" \
--short-code "ASC2025" \
--input-dir ./test_data/subdirectory_mode_examples \
--output-dir ./output/sites \
--batch-entity subdirectory
Result: 3 records (one per subdirectory), each containing multiple files.
Mode 3: hybrid¶
Behavior: Combines both approaches:
- Files in the root directory become individual records
- Each subdirectory becomes one grouped record
When to use:
- Mixed dataset with both standalone artifacts and collections
- Flexibility for differently organized data
Example directory structure:
test_data/hybrid_mode_examples/
├── standalone_artifact_001.obj → Record 1 (standalone)
├── standalone_artifact_001.mtl
├── standalone_reference_photo.png → Record 2 (standalone)
├── excavation_batch_alpha/ → Record 3 (grouped)
│   ├── fragment_001.obj
│   ├── fragment_002.obj
│   ├── fragment_003.obj
│   └── shared_textures/
│       ├── clay_texture.jpg
│       └── weathering_normal.jpg
└── excavation_batch_beta/ → Record 4 (grouped)
    ├── complete_vessel.glb
    ├── vessel_fragments.fbx
    └── documentation.txt
Command:
python main.py create \
--hdpc-path ./projects/mixed_collection.hdpc \
--project-name "Mixed Artifact Collection" \
--short-code "MAC2025" \
--input-dir ./test_data/hybrid_mode_examples \
--output-dir ./output/mixed \
--batch-entity hybrid
Result: 4 records: 2 standalone + 2 grouped subdirectories.
Bundling Strategies¶
Bundling determines how related files are grouped together within a record. This is especially important for 3D models, where one logical artifact might consist of multiple files (e.g., model.obj + model.mtl + textures).
Key Concepts¶
- Primary Source File: The main file (e.g., the .obj file)
- Associated Files: Supporting files (MTL, textures, etc.)
- Bundling Strategy: The rule for determining which files belong together
Strategy 1: stem (Default)¶
Rule: Files with the exact same base filename (stem) are bundled together.
Example:
temple_column.obj ← Primary
temple_column.mtl ← Associated (same stem)
temple_column.jpg ← Associated (same stem)
ceramic_bowl.fbx ← Separate bundle
When to use:
- Standard naming convention where related files share the same name
- Most common scenario for 3D models exported from software
Command:
python main.py create \
--hdpc-path ./projects/stem_bundle.hdpc \
--project-name "Stem Bundling Example" \
--short-code "STEM2025" \
--input-dir ./test_data/root_mode_examples/stem_bundling \
--output-dir ./output/stem \
--batch-entity root \
--enable-bundling \
--bundling-strategy stem
Result: Files grouped by identical stems.
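The stem rule can be illustrated with a short Python sketch. It assumes "stem" means the filename without its final extension, as pathlib defines it; the function name bundle_by_stem is hypothetical and the tool's actual implementation may differ.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def bundle_by_stem(filenames):
    """Group filenames whose stem (name without the final extension) is identical."""
    bundles = defaultdict(list)
    for name in filenames:
        bundles[PurePosixPath(name).stem].append(name)
    return dict(bundles)
```

Applied to the four files in the example above, this yields one bundle of three files for temple_column and a single-file bundle for ceramic_bowl.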
Strategy 2: pattern¶
Rule: Files matching a regex pattern are bundled together by extracting a common identifier.
Example pattern: (.+?)_(Section|Column)_[A-Z0-9]+ captures the text before _Section_ or _Column_ as the bundle identifier, grouping files like:
Athens_Temple_Section_A.obj → Bundle: "Athens_Temple"
Athens_Temple_Section_B.obj → Bundle: "Athens_Temple"
Athens_Temple_materials.mtl → Bundle: "Athens_Temple"
Rome_Forum_Column_01.obj → Bundle: "Rome_Forum"
Rome_Forum_Column_02.obj → Bundle: "Rome_Forum"
Rome_Forum_materials.mtl → Bundle: "Rome_Forum"
When to use:
- Complex naming schemes with prefixes/suffixes
- Multi-part models (e.g., large building sections)
- Need custom grouping logic
Command:
python main.py create \
--hdpc-path ./projects/pattern_bundle.hdpc \
--project-name "Pattern Bundling Example" \
--short-code "PATT2025" \
--input-dir ./test_data/bundling_strategies/pattern_matching \
--output-dir ./output/pattern \
--batch-entity root \
--enable-bundling \
--bundling-strategy pattern \
--bundling-pattern "(.+?)_(Section|Column)_[A-Z0-9]+"
Result: Files grouped by extracted identifier from regex pattern.
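A regex can be tried against sample filenames before it is passed to --bundling-pattern. The sketch below assumes the bundle identifier is the first capture group, matched from the start of the file stem; that reading is an assumption, and bundle_key is a hypothetical helper rather than an HDP function.

```python
import re
from pathlib import PurePosixPath

# The pattern used in the command above.
PATTERN = re.compile(r"(.+?)_(Section|Column)_[A-Z0-9]+")


def bundle_key(filename, pattern=PATTERN):
    """Return the first capture group for a matching stem, or None if no match."""
    match = pattern.match(PurePosixPath(filename).stem)
    return match.group(1) if match else None
```

Under this reading, bundle_key("Athens_Temple_Section_A.obj") gives "Athens_Temple", while Athens_Temple_materials.mtl does not match the regex itself and returns None; the sketch makes no claim about how the tool associates such files with a bundle.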
Strategy 3: prefix_suffix¶
Rule: Remove specified prefix and/or suffix patterns from filenames, then bundle by resulting core name.
Example with prefix [vn]\d+_ (version or inventory number) and suffix _(hiRes|lowRes):
v1_pottery_fragment.obj → Core: "pottery_fragment"
v2_pottery_fragment.obj → Core: "pottery_fragment"
v3_pottery_fragment.obj → Core: "pottery_fragment"
pottery_fragment.mtl → Core: "pottery_fragment"
n001_vase_hiRes.obj → Core: "vase"
n001_vase_lowRes.obj → Core: "vase"
n001_vase.mtl → Core: "vase"
When to use:
- Version numbers at the start of filenames
- Resolution indicators (hiRes/lowRes)
- Inventory prefixes (n001_, n002_)
Command (prefix removal):
python main.py create \
--hdpc-path ./projects/prefix_bundle.hdpc \
--project-name "Prefix Removal Example" \
--short-code "PREF2025" \
--input-dir ./test_data/bundling_strategies/prefix_removal \
--output-dir ./output/prefix \
--batch-entity root \
--enable-bundling \
--bundling-strategy prefix_suffix \
--bundling-prefix "v\d+_"
Command (suffix removal):
python main.py create \
--hdpc-path ./projects/suffix_bundle.hdpc \
--project-name "Suffix Removal Example" \
--short-code "SUFF2025" \
--input-dir ./test_data/bundling_strategies/suffix_removal \
--output-dir ./output/suffix \
--batch-entity root \
--enable-bundling \
--bundling-strategy prefix_suffix \
--bundling-suffix "_(edge|obverse|reverse)_scan"
Result: Files grouped after removing prefix/suffix patterns.
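The effect of a prefix or suffix pattern on a filename can be previewed with a small sketch. It assumes the prefix is anchored at the start of the stem and the suffix at its end, which matches the examples above but is not confirmed against the tool's source; core_name is a hypothetical helper.

```python
import re
from pathlib import PurePosixPath


def core_name(filename, prefix=None, suffix=None):
    """Strip an optional regex prefix and suffix from the file stem."""
    stem = PurePosixPath(filename).stem
    if prefix:
        # Anchor at the start so only a leading match is removed.
        stem = re.sub(rf"^(?:{prefix})", "", stem)
    if suffix:
        # Anchor at the end so only a trailing match is removed.
        stem = re.sub(rf"(?:{suffix})$", "", stem)
    return stem
```

Files whose core_name values are equal would land in the same bundle, for example v2_pottery_fragment.obj and pottery_fragment.mtl with the prefix v\d+_.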
Strategy 4: core_identifier¶
Rule: Extract a core identifier from filenames using a specific pattern (e.g., site042, site093), ignoring descriptive suffixes.
Example:
site042_structure_detail.obj → Bundle: "site042"
site042_structure_main.fbx → Bundle: "site042"
site093_artifact_complete.glb → Bundle: "site093"
site093_artifact_fragment.obj → Bundle: "site093"
When to use:
- Site/excavation number prefixes
- Catalog identifiers embedded in filenames
- Need to group by administrative code
Command:
python main.py create \
--hdpc-path ./projects/core_id_bundle.hdpc \
--project-name "Core Identifier Example" \
--short-code "CORE2025" \
--input-dir ./test_data/bundling_strategies/core_identifier \
--output-dir ./output/core_id \
--batch-entity root \
--enable-bundling \
--bundling-strategy core_identifier
Result: Files grouped by extracted core identifier (e.g., site042).
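This guide does not state which pattern core_identifier uses internally. As a rough illustration only, the sketch below assumes the identifier is a leading run of letters followed by digits (as in site042); both that rule and the helper name core_identifier_of are assumptions.

```python
import re
from pathlib import PurePosixPath

# Assumed shape of an identifier: letters followed by digits at the start of the stem.
CORE_ID = re.compile(r"[A-Za-z]+\d+")


def core_identifier_of(filename):
    """Return the leading letters-plus-digits token of the stem, or None."""
    match = CORE_ID.match(PurePosixPath(filename).stem)
    return match.group(0) if match else None
```

If your identifiers follow a different shape, the pattern strategy with an explicit regex gives more control.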
3D Model-Specific Options¶
--add-mtl / --no-add-mtl¶
Purpose: Automatically scan for and include .mtl (material) files associated with .obj files.
Default: True (enabled)
Behavior:
- When an .obj file is found, the scanner looks for a corresponding .mtl file with the same stem
- Example: roman_statue.obj → scans for roman_statue.mtl
When to disable:
- OBJ files don't use materials
- MTL files are stored separately or managed differently
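The same-stem lookup described under Behavior can be reproduced to find OBJ files that would not get a material file attached. This is a sketch with hypothetical helper names; it matches only the lowercase .obj extension and does not read the mtllib statement that an OBJ file can use to name its material library.

```python
from pathlib import Path


def companion_mtl(obj_path):
    """Return the same-stem .mtl next to an .obj file, or None if it is absent."""
    candidate = Path(obj_path).with_suffix(".mtl")
    return candidate if candidate.is_file() else None


def objs_missing_mtl(input_dir):
    """List .obj files under input_dir that have no same-stem .mtl beside them."""
    return sorted(
        p for p in Path(input_dir).rglob("*.obj") if companion_mtl(p) is None
    )
```

An empty result from objs_missing_mtl suggests every OBJ file has a matching MTL file under the rule described above.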
Command (enabled):
python main.py create \
--hdpc-path ./projects/with_mtl.hdpc \
--project-name "OBJ with Materials" \
--short-code "MTL2025" \
--input-dir ./test_data/complex_dependencies/obj_with_dependencies \
--output-dir ./output/with_mtl \
--add-mtl
Command (disabled):
python main.py create \
--hdpc-path ./projects/no_mtl.hdpc \
--project-name "OBJ without Materials" \
--short-code "NOMTL2025" \
--input-dir ./test_data/root_mode_examples/no_bundling \
--output-dir ./output/no_mtl \
--no-add-mtl
--add-textures / --no-add-textures¶
Purpose: Automatically scan for and include texture image files referenced in .mtl files.
Default: True (enabled)
Behavior:
- Parses .mtl files to extract texture references (e.g., map_Kd marble_diffuse.jpg)
- Searches for these texture files in:
- Same directory as the MTL file
- Subdirectories (e.g., textures/, Materials/)
- Additional paths specified with --texture-paths
Supported texture maps:
- Diffuse: map_Kd
- Specular: map_Ks
- Bump/normal: map_Bump, bump
- Specular exponent (shininess): map_Ns
- Ambient: map_Ka
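To see which texture files an MTL file references, the map statements listed above can be extracted with a few lines of Python. This is a simplified sketch, not the scanner's parser: texture_references is a hypothetical helper, and it assumes the filename is the last token on the line, so filenames containing spaces are not handled.

```python
# Lowercased keywords for the texture map statements listed above.
TEXTURE_KEYWORDS = {"map_kd", "map_ks", "map_bump", "bump", "map_ns", "map_ka"}


def texture_references(mtl_text):
    """Return the distinct texture filenames referenced in MTL text, in order."""
    refs = []
    for line in mtl_text.splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0].lower() in TEXTURE_KEYWORDS:
            # Options such as "-bm 1.0" may precede the filename, so the last
            # token is taken as the file reference.
            if tokens[-1] not in refs:
                refs.append(tokens[-1])
    return refs
```

Passing the contents of an .mtl file (for example, Path("site_overview.mtl").read_text()) returns the list of files the scanner would need to locate.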
When to disable:
- Textures are stored in a separate archive
- Texture files are too large or not needed for publication
Command (enabled with custom search paths):
python main.py create \
--hdpc-path ./projects/with_textures.hdpc \
--project-name "OBJ with Textures" \
--short-code "TEX2025" \
--input-dir ./test_data/complex_dependencies/obj_with_dependencies \
--output-dir ./output/with_textures \
--add-textures \
--texture-paths ./additional_textures ./shared_materials
--archive-textures¶
Purpose: Archive texture subdirectories into ZIP files for more efficient storage and upload.
Default: False (disabled)
Behavior:
- When a texture subdirectory is detected (e.g., textures/, Materials/), it's compressed into a .zip archive
- The archive is included as a child file in the hierarchy
- Original texture files are still tracked but archived
When to use:
- Large texture collections (many small files)
- Zenodo uploads (fewer files = faster uploads)
- Organized texture folders
Example directory:
archaelogical_site_001/
├── site_overview.obj
├── site_overview.mtl
└── textures/ ← Will be archived to textures.zip
    ├── stone_texture.jpg
    ├── wood_normal.jpg
    └── roof.jpg
Command:
python main.py create \
--hdpc-path ./projects/archived_textures.hdpc \
--project-name "Archived Textures Example" \
--short-code "ARCH2025" \
--input-dir ./test_data/subdirectory_mode_examples/archaelogical_site_001 \
--output-dir ./output/archived \
--batch-entity subdirectory \
--add-textures \
--archive-textures
Result: textures/ directory archived to textures.zip in the file hierarchy.
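The archiving step is conceptually the same as zipping a folder. The sketch below shows that operation with Python's standard shutil module; archive_directory is a hypothetical helper, and where the tool actually writes its archive is not specified in this guide.

```python
import shutil
from pathlib import Path


def archive_directory(texture_dir, dest_dir=None):
    """Zip the contents of texture_dir into <dirname>.zip and return its path.

    The archive is written to dest_dir, or beside texture_dir by default.
    """
    texture_dir = Path(texture_dir)
    dest_dir = Path(dest_dir) if dest_dir is not None else texture_dir.parent
    base = dest_dir / texture_dir.name  # make_archive appends ".zip" itself
    return Path(shutil.make_archive(str(base), "zip", root_dir=str(texture_dir)))
```

Writing the archive beside the folder rather than inside it avoids the archive picking itself up while it is being created.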
--texture-paths¶
Purpose: Specify additional directories to search for texture files.
Default: None (only the MTL file's directory and its subdirectories are searched)
Behavior:
- When texture files are not found in the default locations, the scanner searches these additional paths
- Useful for shared texture libraries or centralized material repositories
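The lookup order described for textures (the MTL file's directory, then its subdirectories, then any --texture-paths entries) can be sketched as follows. The function find_texture is hypothetical and the exact order the scanner uses internally is an assumption based on this guide.

```python
from pathlib import Path


def find_texture(reference, mtl_dir, extra_dirs=()):
    """Locate a texture referenced by an MTL file, or return None."""
    mtl_dir = Path(mtl_dir)
    filename = Path(reference).name
    # 1. The reference as written, relative to the MTL file's directory.
    direct = mtl_dir / reference
    if direct.is_file():
        return direct
    # 2. Any subdirectory of the MTL file's directory (e.g., textures/).
    for hit in sorted(mtl_dir.rglob(filename)):
        if hit.is_file():
            return hit
    # 3. Additional directories, in the order they were given.
    for extra in extra_dirs:
        candidate = Path(extra) / filename
        if candidate.is_file():
            return candidate
    return None
```

A None result for a reference corresponds to the "Texture files not found" situation covered under Troubleshooting.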
Command:
python main.py create \
--hdpc-path ./projects/shared_textures.hdpc \
--project-name "Shared Texture Library" \
--short-code "SHARE2025" \
--input-dir ./models \
--output-dir ./output/shared \
--add-textures \
--texture-paths ./common_textures ./material_library
Practical Examples¶
Example 1: Simple Individual Artifacts (Root Mode)¶
Scenario: 4 individual artifact scans, each as separate files, no bundling needed.
Directory:
root_mode_examples/no_bundling/
├── artifact_photo.png
├── building_scan.glb
├── statue_model.obj
└── terrain_data.fbx
Command:
python main.py create \
--hdpc-path ./projects/simple_artifacts.hdpc \
--project-name "Simple Individual Artifacts" \
--short-code "SIA2025" \
--input-dir ./test_data/root_mode_examples/no_bundling \
--output-dir ./output/simple_artifacts \
--batch-entity root
Expected Result:
- 4 separate records created
- Each file becomes its own publishable unit
- No bundling or complex dependencies
Example 2: OBJ Models with MTL and Textures (Stem Bundling)¶
Scenario: Standard 3D model workflow with OBJ files, their MTL materials, and texture images.
Directory:
root_mode_examples/stem_bundling/
├── temple_column.obj
├── temple_column.mtl
├── temple_column.jpg
├── ceramic_bowl.fbx
└── ancient_tablet.glb
Command:
python main.py create \
--hdpc-path ./projects/obj_with_materials.hdpc \
--project-name "OBJ Models with Materials" \
--short-code "OBJMAT2025" \
--input-dir ./test_data/root_mode_examples/stem_bundling \
--output-dir ./output/obj_materials \
--batch-entity root \
--enable-bundling \
--bundling-strategy stem \
--add-mtl \
--add-textures
Expected Result:
- Record 1: temple_column (OBJ + MTL + JPG bundled together)
- Record 2: ceramic_bowl (FBX standalone)
- Record 3: ancient_tablet (GLB standalone)
Example 3: Complex OBJ with Multiple Textures (Subdirectory Mode)¶
Scenario: One artifact per subdirectory, with texture files in a separate subfolder.
Directory:
subdirectory_mode_examples/archaelogical_site_001/
├── site_overview.obj
├── site_overview.mtl
├── excavation_photo_001.png
├── excavation_photo_002.png
└── textures/
    ├── stone_texture.jpg
    ├── wood_normal.jpg
    └── roof.jpg
Command:
python main.py create \
--hdpc-path ./projects/complex_site.hdpc \
--project-name "Archaeological Site with Textures" \
--short-code "ASTEXT2025" \
--input-dir ./test_data/subdirectory_mode_examples \
--output-dir ./output/complex_site \
--batch-entity subdirectory \
--add-mtl \
--add-textures \
--archive-textures
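Expected Result (shown for archaelogical_site_001; each other subdirectory of the input directory becomes its own record in the same way):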
- Record 1: archaelogical_site_001 containing:
- Primary: site_overview.obj
- Associated: site_overview.mtl
- Associated: excavation_photo_001.png, excavation_photo_002.png
- Archived: textures.zip (containing all 3 texture files)
Example 4: Pattern Bundling for Multi-Part Models¶
Scenario: Large building model split into multiple sections (Section A, Section B), sharing common materials.
Directory:
bundling_strategies/pattern_matching/
├── Athens_Temple_Section_A.obj
├── Athens_Temple_Section_B.obj
├── Athens_Temple_materials.mtl
├── Rome_Forum_Column_01.obj
├── Rome_Forum_Column_02.obj
└── Rome_Forum_materials.mtl
Command:
python main.py create \
--hdpc-path ./projects/multipart_models.hdpc \
--project-name "Multi-Part Building Models" \
--short-code "MULTI2025" \
--input-dir ./test_data/bundling_strategies/pattern_matching \
--output-dir ./output/multipart \
--batch-entity root \
--enable-bundling \
--bundling-strategy pattern \
--bundling-pattern "(.+?)_(Section|Column)_[A-Z0-9]+"
Expected Result:
- Record 1: Athens_Temple (Section A OBJ + Section B OBJ + materials MTL)
- Record 2: Rome_Forum (Column 01 OBJ + Column 02 OBJ + materials MTL)
Example 5: Version Control (Prefix Removal)¶
Scenario: Multiple scan versions of the same artifact, with version prefixes v1_, v2_, v3_.
Directory:
bundling_strategies/prefix_removal/
├── v1_pottery_fragment.obj
├── v2_pottery_fragment.obj
├── v3_pottery_fragment.obj
└── pottery_fragment.mtl
Command:
python main.py create \
--hdpc-path ./projects/versioned_scans.hdpc \
--project-name "Versioned Pottery Scans" \
--short-code "VERS2025" \
--input-dir ./test_data/bundling_strategies/prefix_removal \
--output-dir ./output/versioned \
--batch-entity root \
--enable-bundling \
--bundling-strategy prefix_suffix \
--bundling-prefix "v\d+_"
Expected Result:
- Record 1: pottery_fragment (all 3 OBJ versions + shared MTL bundled together)
Example 6: Multi-View Scans (Suffix Removal)¶
Scenario: One artifact scanned from several views (obverse, reverse, edge), with the view encoded as a filename suffix.
Directory:
bundling_strategies/suffix_removal/
├── coin_obverse_scan.obj
├── coin_reverse_scan.obj
├── coin_edge_scan.obj
└── coin.mtl
Command:
python main.py create \
--hdpc-path ./projects/multiview_scans.hdpc \
--project-name "Multi-View Coin Scans" \
--short-code "COIN2025" \
--input-dir ./test_data/bundling_strategies/suffix_removal \
--output-dir ./output/multiview \
--batch-entity root \
--enable-bundling \
--bundling-strategy prefix_suffix \
--bundling-suffix "_(obverse|reverse|edge)_scan"
Expected Result:
- Record 1: coin (obverse + reverse + edge OBJ scans + shared MTL)
Example 7: Hybrid Mode (Mixed Organization)¶
Scenario: Root-level standalone artifacts + subdirectories for grouped collections.
Directory:
hybrid_mode_examples/
├── standalone_artifact_001.obj
├── standalone_artifact_001.mtl
├── standalone_reference_photo.png
├── excavation_batch_alpha/
│   ├── fragment_001.obj
│   ├── fragment_002.obj
│   ├── fragment_003.obj
│   └── shared_textures/
│       ├── clay_texture.jpg
│       └── weathering_normal.jpg
└── excavation_batch_beta/
    ├── complete_vessel.glb
    ├── vessel_fragments.fbx
    └── documentation.txt
Command:
python main.py create \
--hdpc-path ./projects/hybrid_collection.hdpc \
--project-name "Hybrid Artifact Collection" \
--short-code "HYB2025" \
--input-dir ./test_data/hybrid_mode_examples \
--output-dir ./output/hybrid \
--batch-entity hybrid \
--enable-bundling \
--bundling-strategy stem \
--add-mtl \
--add-textures \
--archive-textures
Expected Result:
- Record 1: standalone_artifact_001 (OBJ + MTL bundled)
- Record 2: standalone_reference_photo (standalone PNG)
- Record 3: excavation_batch_alpha (3 OBJ fragments + archived textures)
- Record 4: excavation_batch_beta (GLB + FBX + TXT)
Troubleshooting¶
Issue: "No files found matching extensions"¶
Cause: The specified extensions don't match any files in the input directory.
Solution:
1. Verify --input-dir path is correct
2. Check that files have the expected extensions (e.g., .obj, .glb)
3. Manually specify extensions with --extensions .obj .mtl .png
Example:
python main.py create \
--hdpc-path ./projects/custom_ext.hdpc \
--project-name "Custom Extensions" \
--short-code "CUST2025" \
--input-dir ./my_data \
--output-dir ./output \
--extensions .obj .mtl .fbx .glb .png .jpg
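Before choosing values for --extensions, it can help to see which extensions are actually present in the input directory. The following sketch counts them with the standard library; extension_counts is a hypothetical helper and not part of the HDP CLI.

```python
from collections import Counter
from pathlib import Path


def extension_counts(input_dir):
    """Count files under input_dir (recursively) by lowercased extension."""
    return Counter(
        p.suffix.lower() for p in Path(input_dir).rglob("*") if p.is_file()
    )
```

Extensions are lowercased here so that files such as MODEL.OBJ show up under .obj; whether the scanner itself matches extensions case-insensitively is not stated in this guide.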
Issue: "MTL file not found for OBJ"¶
Cause: OBJ file exists but corresponding MTL file is missing.
Solution:
1. Verify MTL file has the same stem as OBJ (e.g., model.obj → model.mtl)
2. Check if MTL file is in a different directory
3. Use --no-add-mtl if materials are not needed
Issue: "Texture files not found"¶
Cause: MTL file references textures that don't exist or are in a different location.
Solution:
1. Check MTL file contents for texture paths
2. Ensure texture files are in the same directory or subdirectory
3. Use --texture-paths to specify additional search locations:
python main.py create \
--hdpc-path ./projects/missing_tex.hdpc \
--project-name "Missing Textures Fix" \
--short-code "TEX2025" \
--input-dir ./models \
--output-dir ./output \
--add-textures \
--texture-paths ./external_textures ./shared_materials
Issue: "Files not bundled correctly"¶
Cause: Bundling strategy doesn't match the filename convention.
Solution:
1. Verify filenames match the expected pattern
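2. Try different bundling strategies:
- stem: Exact filename match
- pattern: Custom regex pattern
- prefix_suffix: Remove prefixes/suffixes
- core_identifier: Group by an identifier embedded in the filename (e.g., site042)
3. For the pattern strategy, test your regex with the --bundling-pattern flag on a small input directory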
Example (debugging pattern):
# Test pattern bundling on a sample directory
python main.py create \
--hdpc-path ./projects/test_pattern.hdpc \
--project-name "Test Pattern Bundling" \
--short-code "TESTPATT" \
--input-dir ./test_data \
--output-dir ./output \
--enable-bundling \
--bundling-strategy pattern \
--bundling-pattern "(.+?)_Section_[A-Z]"
Issue: "Too many small files in record"¶
Cause: Large texture directories creating many individual file entries.
Solution:
Use --archive-textures to compress texture folders:
python main.py create \
--hdpc-path ./projects/archived.hdpc \
--project-name "Archived Texture Folders" \
--short-code "ARCH2025" \
--input-dir ./large_textures \
--output-dir ./output \
--add-textures \
--archive-textures
Issue: "Project already exists"¶
Cause: .hdpc file already exists at the specified path.
Solution:
1. Delete or rename the existing .hdpc file
2. Choose a different --hdpc-path
3. Use the existing project with other commands (e.g., upload, process)
Advanced Tips¶
Tip 1: Test with Small Dataset First¶
Before processing large collections, test with a small subset:
# Create test subdirectory with a few files
mkdir -p ./test_subset
cp -r ./test_data/subdirectory_mode_examples/archaelogical_site_001 ./test_subset/
# Test command
python main.py create \
--hdpc-path ./projects/test.hdpc \
--project-name "Test Run" \
--short-code "TEST" \
--input-dir ./test_subset \
--output-dir ./output/test \
--batch-entity subdirectory
Tip 2: Use Absolute Paths for Clarity¶
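Relative paths are resolved against the current working directory, so the same command can point at different locations depending on where it is run. Use absolute paths for production: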
python main.py create \
--hdpc-path /full/path/to/projects/production.hdpc \
--project-name "Production Collection" \
--short-code "PROD2025" \
--input-dir /full/path/to/source_data \
--output-dir /full/path/to/output
Tip 3: Organize Output by Project¶
Create dedicated output directories for each project:
mkdir -p ./output/museum_pottery
python main.py create \
--hdpc-path ./projects/museum_pottery.hdpc \
--project-name "Museum Pottery Collection" \
--short-code "MPC2025" \
--input-dir ./source/pottery \
--output-dir ./output/museum_pottery
Tip 4: Document Your Bundling Strategy¶
Record your bundling configuration in a short text file kept alongside the project:
# bundling_config.txt
Project: Archaeological Site Scans
Strategy: pattern
Pattern: (.+?)_Section_[A-Z0-9]+
Reasoning: Large building models split into lettered sections
Summary¶
The create command provides a powerful, flexible way to initialize Heritage Data Processor projects with comprehensive file scanning and validation. Key takeaways:
- Batch Entity Mode determines how files are grouped into records
- Bundling Strategy controls how related files are associated within records
- 3D Model Options enable automatic MTL and texture scanning
- Test with small datasets before processing large collections
- Choose bundling strategy based on your filename conventions
For further assistance, consult the test data examples or contact the development team.