Installation Script Reference¶
The Heritage Data Processor Installation Script (install_hdp.sh) automates the complete installation process, including repository cloning, version selection, Python environment setup, and Node.js dependency management.
Overview¶
This script provides a guided installation experience with comprehensive system validation, interactive version selection, and error recovery capabilities.
Usage¶
Basic Usage¶
Launches interactive installation with default settings: installs to ./heritage-data-processor and prompts for version selection.
Command-Line Options¶
Options:
- --dir DIRECTORY: Install to specific directory. Defaults to ./heritage-data-processor
- --version VERSION: Install specific version tag without interactive selection
- --non-interactive: Skip interactive prompts, use defaults (useful for CI/CD)
- --help: Display usage information and exit
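The options above could be handled with a plain while/case loop. The following is a minimal sketch, not the script's actual code: the function name parse_args and the variables INSTALL_DIR, VERSION, NON_INTERACTIVE, and SHOW_HELP are hypothetical, and only the flags and the default directory come from this page.

```shell
# Defaults taken from the option descriptions above.
INSTALL_DIR="./heritage-data-processor"
VERSION=""            # empty means "ask interactively"
NON_INTERACTIVE=0
SHOW_HELP=0

# parse_args is a hypothetical helper, not the script's real function.
parse_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --dir)
        [ "$#" -ge 2 ] || { echo "--dir requires a DIRECTORY" >&2; return 1; }
        INSTALL_DIR=$2
        shift 2
        ;;
      --version)
        [ "$#" -ge 2 ] || { echo "--version requires a VERSION" >&2; return 1; }
        VERSION=$2
        shift 2
        ;;
      --non-interactive)
        NON_INTERACTIVE=1
        shift
        ;;
      --help)
        SHOW_HELP=1   # the caller prints usage and exits
        shift
        ;;
      *)
        echo "Unknown option: $1" >&2
        return 1
        ;;
    esac
  done
}
```

A caller would run `parse_args "$@"` first and then branch on the resulting variables.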
Usage Examples¶
Example 1: Interactive Installation¶
Prompts for version selection and installs to default directory.
Example 2: Install Specific Version¶
Installs version 1.2.0 without version selection prompt.
Example 3: Custom Installation Directory¶
Installs to /opt/hdp instead of default location.
Example 4: Automated Installation¶
Fully automated installation for CI/CD pipelines.
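The four examples plausibly correspond to invocations like the ones sketched here. This assumes install_hdp.sh sits in the current directory, that the tag for version 1.2.0 is spelled v1.2.0 (as in the version list shown under Step 2), and that Example 4 needs only --non-interactive. The show helper is hypothetical and only prints each command, so the list can be reviewed without starting an installation.

```shell
# show() prints a command instead of running it (hypothetical dry-run helper).
show() { printf '%s\n' "$*"; }

show ./install_hdp.sh                      # Example 1: interactive installation
show ./install_hdp.sh --version v1.2.0     # Example 2: specific version (tag spelling assumed)
show ./install_hdp.sh --dir /opt/hdp       # Example 3: custom installation directory
show ./install_hdp.sh --non-interactive    # Example 4: automated installation for CI/CD
```

Dropping the `show` prefix runs the corresponding command for real.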
Requirements¶
Required Tools¶
- git: Version control for repository cloning
- uv: Python package installer and virtual environment manager
- npm: Node.js package manager (comes with Node.js)
Recommended Versions¶
- Python: 3.11+ (required for project compatibility)
- Node.js: 16+ (recommended for modern package support)
System Resources¶
- Disk Space: Minimum 1000 MB free space
- Network: Internet connectivity required for GitHub access and package downloads
Installation Steps¶
Step 1: System Validation¶
Validates system prerequisites before beginning installation.
Checks Performed:
- Tool Availability: Verifies git, uv, and npm are installed and accessible
- Version Compatibility: Checks Python (3.8+ minimum, 3.11+ recommended) and Node.js (16+ recommended) versions
- Disk Space: Ensures at least 1000 MB available
- Directory Conflict: Verifies installation directory does not already exist
- Network Connectivity: Tests connection to GitHub
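The disk space and directory conflict checks can be approximated with standard POSIX tools. This is a sketch under assumptions: the helper names are hypothetical, the 1000 MB threshold comes from the requirements above, and the actual script may measure free space differently.

```shell
MIN_FREE_MB=1000

# Prints free space in MB for the filesystem that holds the given path.
# df -P guarantees one line per filesystem; column 4 is "Available" in KB.
available_mb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeeds when at least MIN_FREE_MB is free at the given path.
has_enough_space() {
  [ "$(available_mb "$1")" -ge "$MIN_FREE_MB" ]
}

# Succeeds when the target directory does not exist yet.
target_is_free() {
  [ ! -e "$1" ]
}
```

Because the installation directory must not exist yet, the space check would be run against its parent, for example `has_enough_space "$(dirname "$INSTALL_DIR")"`.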
Exit Conditions:
If any required tool is missing, the script provides installation instructions and exits with error code 1.
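A tool availability check of this kind is commonly written with `command -v`. The sketch below uses hypothetical helper names (missing_tools, require_tools); only the tool names and the exit code of 1 come from this page.

```shell
# Prints the names of the given tools that are not on PATH, space-separated.
missing_tools() {
  mt_missing=""
  for mt_tool in "$@"; do
    if ! command -v "$mt_tool" >/dev/null 2>&1; then
      mt_missing="${mt_missing:+$mt_missing }$mt_tool"
    fi
  done
  printf '%s' "$mt_missing"
}

# Returns 1 and reports the gap when any tool is missing.
require_tools() {
  rt_missing=$(missing_tools "$@")
  if [ -n "$rt_missing" ]; then
    echo "Missing required tools: $rt_missing" >&2
    return 1
  fi
}
```

A script could then call `require_tools git uv npm || exit 1` before doing any work.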
Example Output:
[INFO] Step 1: System Validation
[DEBUG] git: git version 2.39.0
[DEBUG] uv: uv 0.1.20
[DEBUG] npm: 9.6.7
[DEBUG] Python version: 3.11.5
[DEBUG] Node.js version: 20.5.0
[DEBUG] Available disk space: 15234MB
✅ All required tools are installed
✅ Network connectivity verified
✅ Pre-flight checks completed
Step 2: Version Selection¶
Fetches available versions from repository and prompts user for selection.
Interactive Mode:
Displays up to 15 of the most recent tagged versions with commit dates and messages:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Available Versions
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
0) Latest (main branch) - Most recent development version
1) v1.2.0 2025-10-15 Release: Enhanced validation
2) v1.1.5 2025-09-20 Bugfix: Pipeline execution
3) v1.1.4 2025-09-10 Feature: Batch processing
... and 42 more versions
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Select version to install (0-15, default: 0):
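A menu like the one above can be built by numbering a list of tags and validating the answer. The sketch below is illustrative only: format_version_menu and resolve_selection are hypothetical names, and the git command in the comment is one possible way to produce the input, not necessarily what the script runs.

```shell
# Reads "tag date message" lines on stdin (newest first) and prints the menu.
# One way to produce such lines inside a clone (an assumption):
#   git for-each-ref --sort=-creatordate \
#     --format='%(refname:short) %(creatordate:short) %(subject)' refs/tags
format_version_menu() {
  awk -v max="${1:-15}" '
    BEGIN     { print "0) Latest (main branch) - Most recent development version" }
    NR <= max { printf "%d) %s\n", NR, $0 }
    END       { if (NR > max) printf "... and %d more versions\n", NR - max }
  '
}

# Validates the answer: empty input means 0, otherwise an integer in 0..max.
resolve_selection() {
  rs_input=${1:-0}
  rs_max=$2
  case "$rs_input" in
    *[!0-9]*) return 1 ;;   # reject anything that is not all digits
  esac
  [ "$rs_input" -le "$rs_max" ] || return 1
  printf '%s\n' "$rs_input"
}
```

Treating an empty answer as 0 mirrors the "default: 0" shown in the prompt.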
Non-Interactive Mode:
Automatically selects the latest tagged version without prompting.
Command-Line Version:
When --version TAG is specified, validates the tag exists in the repository.
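One way to check a tag before cloning is `git ls-remote` with `--exit-code`, which returns a non-zero status when no matching ref exists. This is a sketch of that technique, not a statement of how the script performs the check; remote_tag_exists is a hypothetical helper, and the repository URL is passed in by the caller.

```shell
# Succeeds when the tag exists in the given repository (URL or local path).
# $1: repository URL, $2: tag name
remote_tag_exists() {
  git ls-remote --exit-code --tags "$1" "refs/tags/$2" >/dev/null 2>&1
}
```

A caller might use it as `remote_tag_exists "$REPO_URL" "$VERSION" || exit 1`, where REPO_URL is assumed to hold the repository address.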
Step 3: Cloning Repository¶
Clones the Heritage Data Processor repository from GitHub.
Repository URL:
Process:
- Clone: Executes git clone to download repository
- Navigate: Changes working directory to cloned repository
- Checkout: If a specific version was selected, checks out that tag
- Verify: Displays current commit information
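The four steps above can be sketched as a single function. The name clone_and_checkout and its argument order are hypothetical, and the repository URL is supplied by the caller rather than hard-coded.

```shell
# $1: repository URL, $2: target directory, $3: optional tag to check out
clone_and_checkout() {
  cc_url=$1
  cc_dir=$2
  cc_tag=${3:-}
  git clone --quiet "$cc_url" "$cc_dir" || return 1   # Clone
  cd "$cc_dir" || return 1                            # Navigate
  if [ -n "$cc_tag" ]; then
    git checkout --quiet "$cc_tag" || return 1        # Checkout the selected tag
  fi
  git log -1 --date=short --format='%h %ad %s'        # Verify: show current commit
}
```

Note that the function changes the caller's working directory, which matches the Navigate step; run it in a subshell if that is not wanted.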
Output Logging:
All git operations are logged to both console and log file for troubleshooting.
Error Recovery:
If cloning fails, provides diagnostic information about possible causes (network issues, access denied, invalid URL).
Step 4: Python Environment Setup¶
Creates isolated Python virtual environment and installs dependencies.
Virtual Environment:
- Location: .venv directory in project root
- Python Version: 3.11 (enforced)
- Tool: Uses uv venv --python=3.11
Dependency Installation:
Installs packages from requirements.txt using uv pip install:
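The two uv commands named in this step can be wrapped as follows. This is a sketch: setup_python_env, run, and the DRY_RUN switch are hypothetical additions that make it possible to preview the commands without uv installed, and the `-r requirements.txt` argument is inferred from the sentence above.

```shell
# Runs a command, or only prints it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Creates the virtual environment, then installs the pinned dependencies.
# Expected to run from the project root, where requirements.txt lives.
setup_python_env() {
  run uv venv --python=3.11 || return 1
  run uv pip install -r requirements.txt
}
```

Setting `DRY_RUN=1` before calling `setup_python_env` prints both commands instead of executing them.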
Python Version Enforcement:
If Python 3.11 is not available, the script exits with installation instructions for the user's platform:
- Ubuntu/Debian: sudo apt install python3.11
- macOS: brew install python@3.11
- Windows: Download from python.org
Verification:
After installation, counts and reports the number of installed packages.
Step 5: Node.js Dependencies¶
Installs Node.js packages required for the application.
Package Manager Detection:
Automatically detects and uses the appropriate package manager based on lock files:
- pnpm-lock.yaml → Uses pnpm
- yarn.lock → Uses yarn
- package-lock.json → Uses npm ci (reproducible install)
- package.json only → Uses npm install
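Lock-file detection of this kind reduces to a chain of file tests in priority order. In the sketch below, detect_install_command is a hypothetical name, and `pnpm install` and `yarn install` are assumed commands, since this page only says that pnpm and yarn are used.

```shell
# Prints the install command for the project in the given directory.
# Returns 1 when no package.json or lock file is found.
detect_install_command() {
  dic_dir=${1:-.}
  if [ -f "$dic_dir/pnpm-lock.yaml" ]; then
    echo "pnpm install"      # assumed pnpm invocation
  elif [ -f "$dic_dir/yarn.lock" ]; then
    echo "yarn install"      # assumed yarn invocation
  elif [ -f "$dic_dir/package-lock.json" ]; then
    echo "npm ci"            # reproducible install from the lock file
  elif [ -f "$dic_dir/package.json" ]; then
    echo "npm install"
  else
    return 1
  fi
}
```

The order of the tests matters: a project that has both yarn.lock and package-lock.json would resolve to yarn in this sketch.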
Installation Command:
For reproducible builds with a lock file (package-lock.json), the script runs npm ci.
For a standard installation without a lock file, it runs npm install.
Error Handling:
Provides detailed diagnostic information for common failure scenarios:
- Network timeout
- Package registry issues
- Peer dependency conflicts
- Node.js version incompatibility
Step 6: Post-Installation Configuration¶
Performs final setup tasks.
Version Info File:
Creates .installed_version file containing:
Script Permissions:
Makes start_hdp.sh executable if present:
Documentation Detection:
Checks for and reports presence of README files.
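The permission and documentation steps can be sketched with two small helpers. Both function names are hypothetical, and matching README files with the pattern README* is an assumption about what counts as a README.

```shell
# Makes start_hdp.sh executable when it exists in the given directory.
make_start_script_executable() {
  if [ -f "$1/start_hdp.sh" ]; then
    chmod +x "$1/start_hdp.sh"
  fi
}

# Prints the file name of every README* entry found in the given directory.
list_readmes() {
  for lr_file in "$1"/README*; do
    if [ -e "$lr_file" ]; then
      printf '%s\n' "${lr_file##*/}"
    fi
  done
}
```

The `[ -e ... ]` guard covers the case where the pattern matches nothing and the shell leaves it unexpanded.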
Output & Logging¶
Log File¶
All operations are logged to a timestamped file:
Format: install_hdp_YYYYMMDD_HHMMSS.log
Location: Moved to installation directory after successful completion
Content: Includes all commands executed, their output, and diagnostic information
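A file name in the format above can be produced with `date`, and command output can be duplicated to the console and the log with `tee`. The sketch uses hypothetical names (new_log_file_name, run_logged, LOG_FILE) and is a simplification: in plain POSIX sh the pipeline reports the exit status of tee, not of the logged command.

```shell
# Prints a name such as install_hdp_20251021_143000.log
new_log_file_name() {
  printf 'install_hdp_%s.log\n' "$(date +%Y%m%d_%H%M%S)"
}

# Records the command line, then shows its output and appends it to LOG_FILE.
# Caveat: the return status is tee's, so the command's own status is lost here.
run_logged() {
  printf '$ %s\n' "$*" >> "$LOG_FILE"
  "$@" 2>&1 | tee -a "$LOG_FILE"
}
```

A script would set `LOG_FILE=$(new_log_file_name)` once at startup and route each command through `run_logged`.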
Console Output¶
Color-Coded Messages:
- 🔴 Red: Errors requiring immediate attention
- ⚠️ Yellow: Warnings that do not stop installation
- ✅ Green: Success messages
- 🔵 Blue: Debug information
- 🔷 Cyan: Step headers
Log Levels:
- ERROR: Critical failures
- WARN: Warnings
- INFO: General information
- DEBUG: Detailed diagnostic data
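A leveled, color-coded logger matching the `[LEVEL] message` lines in the example output might look like this. It is a sketch: the function name log and the specific ANSI codes are assumptions, and INFO is left uncolored because this page does not assign it a color.

```shell
# log LEVEL MESSAGE... prints "[LEVEL] MESSAGE" with the level tag colored.
log() {
  log_level=$1
  shift
  case "$log_level" in
    ERROR) log_color='\033[0;31m' ;;  # red
    WARN)  log_color='\033[0;33m' ;;  # yellow
    DEBUG) log_color='\033[0;34m' ;;  # blue
    *)     log_color='' ;;            # INFO and others: no color assumed
  esac
  printf '%b[%s]%b %s\n' "$log_color" "$log_level" '\033[0m' "$*"
}
```

For example, `log DEBUG "Python version: 3.11.5"` prints a blue `[DEBUG]` tag followed by the message.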
Error Handling¶
Cleanup on Error¶
When installation fails, the script offers to remove the incomplete installation directory:
Interactive Mode:
⚠️ Cleaning up failed installation...
Remove incomplete installation directory './heritage-data-processor'? (y/n):
Non-Interactive Mode:
Preserves the installation directory for manual inspection.
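The two cleanup behaviors can be sketched as one function that takes the mode as an argument. The name cleanup_failed_install and its parameters are hypothetical; the prompt text follows the example above.

```shell
# $1: installation directory, $2: 1 for non-interactive mode, 0 otherwise
cleanup_failed_install() {
  cf_dir=$1
  cf_non_interactive=${2:-0}
  [ -d "$cf_dir" ] || return 0
  if [ "$cf_non_interactive" = 1 ]; then
    echo "Preserving '$cf_dir' for manual inspection."
    return 0
  fi
  printf "Remove incomplete installation directory '%s'? (y/n): " "$cf_dir"
  read -r cf_answer || cf_answer=n    # treat end of input as "no"
  case "$cf_answer" in
    y|Y) rm -rf -- "$cf_dir" ;;
    *)   echo "Keeping '$cf_dir'." ;;
  esac
}
```

Such a function would typically be invoked from the script's error path, for instance via a `trap` handler, which is an assumption about the script's structure.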
Common Error Scenarios¶
Missing Required Tools:
🔴 Missing required tools: uv
Installation instructions:
- git: https://git-scm.com/downloads
- uv: curl -LsSf https://astral.sh/uv/install.sh | sh
- npm: https://nodejs.org/ (comes with Node.js)
After installation, restart your terminal.
Insufficient Disk Space:
Installation Directory Exists:
🔴 Installation directory already exists: ./heritage-data-processor
Choose a different location with --dir or remove the existing directory.
Network Connectivity Failure:
Installation Summary¶
Upon successful completion, the script displays a comprehensive summary:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✅ Installation Complete!
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📦 Installation Summary:
Location: ./heritage-data-processor
Version: v1.2.0
Python Environment: .venv (Python 3.11)
Node Modules: node_modules
Duration: 245s
Log File: ./heritage-data-processor/install_hdp_20251021_143000.log
🚀 Next Steps:
Start the application:
./heritage-data-processor/start_hdp.sh
Or navigate to project directory first:
cd ./heritage-data-processor
./start_hdp.sh