Installation Script Reference

The Heritage Data Processor Installation Script (install_hdp.sh) automates the complete installation process, including repository cloning, version selection, Python environment setup, and Node.js dependency management.

Overview

The script guides the user through installation: it validates system prerequisites, offers interactive version selection, and offers to clean up when a step fails.


Usage

Basic Usage

./install_hdp.sh

Launches interactive installation with default settings: installs to ./heritage-data-processor and prompts for version selection.

Command-Line Options

./install_hdp.sh [OPTIONS]

Options:

  • --dir DIRECTORY: Install to specific directory. Defaults to ./heritage-data-processor
  • --version VERSION: Install specific version tag without interactive selection
  • --non-interactive: Skip interactive prompts, use defaults (useful for CI/CD)
  • --help: Display usage information and exit
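
The options above could be parsed with a loop like the following. This is a minimal sketch, not the actual contents of install_hdp.sh: the variable names INSTALL_DIR, VERSION, and INTERACTIVE and the function name parse_args are assumptions made for illustration.

```shell
# Hypothetical option parser for the four documented flags.
INSTALL_DIR="./heritage-data-processor"   # documented default directory
VERSION=""                                # empty means no --version was given
INTERACTIVE=1                             # set to 0 by --non-interactive

parse_args() {
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --dir)
                [ "$#" -ge 2 ] || { echo "--dir requires a value" >&2; return 1; }
                INSTALL_DIR=$2; shift 2 ;;
            --version)
                [ "$#" -ge 2 ] || { echo "--version requires a value" >&2; return 1; }
                VERSION=$2; shift 2 ;;
            --non-interactive)
                INTERACTIVE=0; shift ;;
            --help)
                echo "Usage: ./install_hdp.sh [--dir DIRECTORY] [--version VERSION] [--non-interactive] [--help]"
                exit 0 ;;
            *)
                echo "Unknown option: $1" >&2; return 1 ;;
        esac
    done
}
```

Calling parse_args "$@" at the top of a script would leave the three variables set for the later steps.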

Usage Examples

Example 1: Interactive Installation

./install_hdp.sh

Prompts for version selection and installs to default directory.


Example 2: Install Specific Version

./install_hdp.sh --version v1.2.0

Installs version 1.2.0 without version selection prompt.


Example 3: Custom Installation Directory

./install_hdp.sh --dir /opt/hdp

Installs to /opt/hdp instead of default location.


Example 4: Automated Installation

./install_hdp.sh --non-interactive --version v1.2.0 --dir /var/apps/hdp

Fully automated installation for CI/CD pipelines.


Requirements

Required Tools

  • git: Version control for repository cloning
  • uv: Python package installer and virtual environment manager
  • npm: Node.js package manager (comes with Node.js)
  • Python: 3.11+ (required for project compatibility)
  • Node.js: 16+ (recommended for modern package support)

System Resources

  • Disk Space: Minimum 1000 MB free space
  • Network: Internet connectivity required for GitHub access and package downloads

Installation Steps

Step 1: System Validation

Validates system prerequisites before beginning installation.

Checks Performed:

  • Tool Availability: Verifies git, uv, and npm are installed and accessible
  • Version Compatibility: Checks Python (3.8+ minimum, 3.11+ recommended; Step 4 requires Python 3.11 for the virtual environment) and Node.js (16+ recommended) versions
  • Disk Space: Ensures at least 1000 MB available
  • Directory Conflict: Verifies installation directory does not already exist
  • Network Connectivity: Tests connection to GitHub
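
The version and disk-space checks can be sketched with a few small helpers. The function names minor_of, version_ge, and available_mb are hypothetical, and the comparison only looks at the major and minor components; the real script may compare versions differently.

```shell
# minor_of VERSION: prints the minor component, or 0 when there is none ("16" -> 0).
minor_of() {
    case "$1" in
        *.*) rest=${1#*.}; echo "${rest%%.*}" ;;
        *)   echo 0 ;;
    esac
}

# version_ge ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED (major.minor only).
version_ge() {
    a_major=${1%%.*}; a_minor=$(minor_of "$1")
    b_major=${2%%.*}; b_minor=$(minor_of "$2")
    if [ "$a_major" -ne "$b_major" ]; then
        [ "$a_major" -gt "$b_major" ]
    else
        [ "$a_minor" -ge "$b_minor" ]
    fi
}

# available_mb DIR: free space in MB on the filesystem that holds DIR.
# Uses POSIX `df -Pk` (1024-byte blocks, one line per filesystem).
available_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Example use (commented out so the sketch has no side effects):
#   py=$(python3 --version | awk '{ print $2 }')   # "Python 3.11.5" -> 3.11.5
#   version_ge "$py" 3.11 || echo "Python 3.11+ not found" >&2
#   [ "$(available_mb .)" -ge 1000 ] || echo "Insufficient disk space" >&2
```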

Exit Conditions:

If any required tool is missing, the script provides installation instructions and exits with error code 1.
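
A tool-availability check of this kind is commonly built on `command -v`. The sketch below uses the hypothetical function names find_missing_tools and check_required_tools, and returns 1 where the documented script exits with code 1.

```shell
# find_missing_tools TOOL...: prints the names that are not on PATH, space-separated.
find_missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    echo "${missing# }"   # drop the leading space
}

# check_required_tools: reports missing tools and fails if any are absent.
check_required_tools() {
    missing=$(find_missing_tools git uv npm)
    if [ -n "$missing" ]; then
        echo "Missing required tools: $missing" >&2
        return 1
    fi
}
```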

Example Output:

[INFO] Step 1: System Validation
[DEBUG] git: git version 2.39.0
[DEBUG] uv: uv 0.1.20
[DEBUG] npm: 9.6.7
[DEBUG] Python version: 3.11.5
[DEBUG] Node.js version: 20.5.0
[DEBUG] Available disk space: 15234MB
✅ All required tools are installed
✅ Network connectivity verified
✅ Pre-flight checks completed

Step 2: Version Selection

Fetches available versions from repository and prompts user for selection.

Interactive Mode:

Displays up to 15 most recent tagged versions with commit dates and messages:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Available Versions
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  0) Latest (main branch) - Most recent development version
  1) v1.2.0              2025-10-15 Release: Enhanced validation
  2) v1.1.5              2025-09-20 Bugfix: Pipeline execution
  3) v1.1.4              2025-09-10 Feature: Batch processing
  ... and 42 more versions
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Select version to install (0-15, default: 0): 

Non-Interactive Mode:

Automatically selects the latest tagged version without prompting.

Command-Line Version:

When --version TAG is specified, validates the tag exists in the repository.
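
One way to validate a tag without cloning is to inspect the output of `git ls-remote --tags`. The following is a sketch under that assumption; the function names tag_in_listing and validate_tag are hypothetical, and the real script may validate tags another way.

```shell
REPO_URL="https://github.com/Digital-Humanities-Jena/heritage-data-processor.git"

# tag_in_listing TAG: reads `git ls-remote --tags` output on stdin
# ("<sha><TAB>refs/tags/<name>" per line) and succeeds on an exact match.
tag_in_listing() {
    awk -v ref="refs/tags/$1" '$2 == ref { found = 1 } END { exit !found }'
}

# validate_tag TAG: succeeds when TAG exists in the remote repository.
validate_tag() {
    git ls-remote --tags "$REPO_URL" | tag_in_listing "$1"
}
```

Matching the full ref name keeps a prefix such as v1.2 from being accepted when only v1.2.0 exists.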


Step 3: Cloning Repository

Clones the Heritage Data Processor repository from GitHub.

Repository URL:

https://github.com/Digital-Humanities-Jena/heritage-data-processor.git

Process:

  1. Clone: Executes git clone to download repository
  2. Navigate: Changes working directory to cloned repository
  3. Checkout: If a specific version was selected, checks out that tag
  4. Verify: Displays current commit information
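
The four steps above can be sketched as a single function. The name clone_and_checkout and the exact `git log` format are assumptions; only the repository URL and the clone, navigate, checkout, verify sequence come from this page.

```shell
REPO_URL="https://github.com/Digital-Humanities-Jena/heritage-data-processor.git"

# clone_and_checkout TARGET_DIR [VERSION]
# An empty VERSION leaves the clone on the default branch.
clone_and_checkout() {
    target_dir=$1
    version=${2:-}
    git clone "$REPO_URL" "$target_dir" || return 1     # 1. Clone
    cd "$target_dir" || return 1                        # 2. Navigate
    if [ -n "$version" ]; then
        git checkout "$version" || return 1             # 3. Checkout
    fi
    git log -1 --format='%h %ad %s' --date=short        # 4. Verify
}
```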

Output Logging:

All git operations are logged to both console and log file for troubleshooting.

Error Recovery:

If cloning fails, provides diagnostic information about possible causes (network issues, access denied, invalid URL).


Step 4: Python Environment Setup

Creates isolated Python virtual environment and installs dependencies.

Virtual Environment:

  • Location: .venv directory in project root
  • Python Version: 3.11 (enforced)
  • Tool: Uses uv venv --python=3.11

Dependency Installation:

Installs packages from requirements.txt using uv pip install:

uv pip install --python .venv -r requirements.txt

Python Version Enforcement:

If Python 3.11 is not available, the script exits with installation instructions for the user's platform:

  • Ubuntu/Debian: sudo apt install python3.11
  • macOS: brew install python@3.11
  • Windows: Download from python.org

Verification:

After installation, counts and reports the number of installed packages.
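
Put together, the environment setup might look like the sketch below. The `uv venv` and `uv pip install` invocations are the ones shown in this step; counting packages with `uv pip freeze` and the function name setup_python_env are assumptions about how the verification could be done.

```shell
# setup_python_env: run from the project root, where requirements.txt lives.
setup_python_env() {
    uv venv --python=3.11 || return 1
    uv pip install --python .venv -r requirements.txt || return 1
    # One line per installed package; tr strips the padding some wc builds add.
    count=$(uv pip freeze --python .venv | wc -l | tr -d ' ')
    echo "Installed $count Python packages"
}
```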


Step 5: Node.js Dependencies

Installs Node.js packages required for the application.

Package Manager Detection:

Automatically detects and uses the appropriate package manager based on lock files:

  • pnpm-lock.yaml → Uses pnpm
  • yarn.lock → Uses yarn
  • package-lock.json → Uses npm ci (reproducible install)
  • package.json only → Uses npm install
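
The lock-file detection above can be expressed as a chain of file tests. In this sketch the function name detect_package_manager and the printed labels are hypothetical, and the precedence (pnpm, then yarn, then npm) is assumed from the order of the list.

```shell
# detect_package_manager [DIR]: prints a label for the install method to use,
# or fails when DIR contains no package.json at all.
detect_package_manager() {
    dir=${1:-.}
    if   [ -f "$dir/pnpm-lock.yaml" ];    then echo "pnpm"
    elif [ -f "$dir/yarn.lock" ];         then echo "yarn"
    elif [ -f "$dir/package-lock.json" ]; then echo "npm-ci"
    elif [ -f "$dir/package.json" ];      then echo "npm-install"
    else return 1
    fi
}
```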

Installation Command:

For reproducible builds with lock file:

npm ci

For standard installation without lock file:

npm install

Error Handling:

Provides detailed diagnostic information for common failure scenarios:

  • Network timeout
  • Package registry issues
  • Peer dependency conflicts
  • Node.js version incompatibility

Step 6: Post-Installation Configuration

Performs final setup tasks.

Version Info File:

Creates .installed_version file containing:

VERSION=v1.2.0
INSTALL_DATE=2025-10-21 14:30:00
COMMIT=abc123def456
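
A file in this format could be written as follows. This is a sketch: the function name write_version_file and its optional third argument are hypothetical, and obtaining the commit with `git rev-parse --short=12 HEAD` is an assumption based on the 12-character example above.

```shell
# write_version_file VERSION COMMIT [FILE]
# FILE defaults to .installed_version in the current directory.
write_version_file() {
    version=$1
    commit=$2
    file=${3:-.installed_version}
    {
        echo "VERSION=$version"
        echo "INSTALL_DATE=$(date '+%Y-%m-%d %H:%M:%S')"
        echo "COMMIT=$commit"
    } > "$file"
}

# Example use inside the cloned repository (commented out):
#   write_version_file "v1.2.0" "$(git rev-parse --short=12 HEAD)"
```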

Script Permissions:

Makes start_hdp.sh executable if present:

chmod +x start_hdp.sh

Documentation Detection:

Checks for and reports presence of README files.


Output & Logging

Log File

All operations are logged to a timestamped file:

Format: install_hdp_YYYYMMDD_HHMMSS.log

Location: Moved to installation directory after successful completion

Content: Includes all commands executed, their output, and diagnostic information
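
The log file name and the "command plus output" content can be sketched like this. The helper name run_logged is hypothetical, and this version buffers each command's output before printing it, which the real script may not do.

```shell
# File name in the documented install_hdp_YYYYMMDD_HHMMSS.log format.
LOG_FILE="install_hdp_$(date +%Y%m%d_%H%M%S).log"

# run_logged CMD...: records the command line, then shows its combined
# stdout/stderr on the console and appends it to LOG_FILE.
# The command's exit status is preserved for the caller.
run_logged() {
    printf '$ %s\n' "$*" >> "$LOG_FILE"
    out=$("$@" 2>&1) && status=0 || status=$?
    printf '%s\n' "$out" | tee -a "$LOG_FILE"
    return "$status"
}

# After a successful run the log could be moved next to the installation:
#   mv "$LOG_FILE" "$INSTALL_DIR/"     # INSTALL_DIR is a hypothetical variable
```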

Console Output

Color-Coded Messages:

  • 🔴 Red: Errors requiring immediate attention
  • ⚠️ Yellow: Warnings that do not stop installation
  • ✅ Green: Success messages
  • 🔵 Blue: Debug information
  • 🔷 Cyan: Step headers

Log Levels:

  • ERROR: Critical failures
  • WARN: Warnings
  • INFO: General information
  • DEBUG: Detailed diagnostic data
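
A leveled, color-coded logger along these lines might be written as below. It is a sketch only: the function name log is hypothetical, the ANSI codes are standard terminal colors rather than values taken from the script, and leaving INFO uncolored is an assumption since this page does not assign it a color.

```shell
# log LEVEL MESSAGE...: prints "[LEVEL] message", colored when stdout is a terminal.
log() {
    level=$1; shift
    case "$level" in
        ERROR) code=31 ;;   # red
        WARN)  code=33 ;;   # yellow
        DEBUG) code=34 ;;   # blue
        *)     code=0  ;;   # INFO and anything else: terminal default (assumption)
    esac
    if [ -t 1 ]; then
        printf '\033[%sm[%s]\033[0m %s\n' "$code" "$level" "$*"
    else
        # Plain text when output is piped or redirected, e.g. into a log file.
        printf '[%s] %s\n' "$level" "$*"
    fi
}
```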

Error Handling

Cleanup on Error

When installation fails, the script offers to remove the incomplete installation directory:

Interactive Mode:

⚠️ Cleaning up failed installation...
Remove incomplete installation directory './heritage-data-processor'? (y/n):

Non-Interactive Mode:

Preserves the installation directory for manual inspection.
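
The two cleanup behaviors can be sketched as one function that an error handler would call. The name cleanup_failed_install and its arguments are hypothetical; the prompt text follows the interactive example above.

```shell
# cleanup_failed_install DIR INTERACTIVE
# INTERACTIVE=1 asks before deleting; INTERACTIVE=0 always keeps DIR.
cleanup_failed_install() {
    dir=$1
    interactive=$2
    [ -d "$dir" ] || return 0          # nothing to clean up (also guards an empty DIR)
    if [ "$interactive" -eq 1 ]; then
        printf "Remove incomplete installation directory '%s'? (y/n): " "$dir"
        read -r answer
        case "$answer" in
            y|Y) rm -rf -- "$dir"; echo "Removed $dir" ;;
            *)   echo "Kept $dir" ;;
        esac
    else
        echo "Kept $dir for manual inspection"
    fi
}
```

Keeping the directory in non-interactive mode means a failed CI run leaves its partial state available for debugging.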

Common Error Scenarios

Missing Required Tools:

🔴 Missing required tools: uv

Installation instructions:
 - git: https://git-scm.com/downloads
 - uv: curl -LsSf https://astral.sh/uv/install.sh | sh
 - npm: https://nodejs.org/ (comes with Node.js)

After installation, restart your terminal.

Insufficient Disk Space:

🔴 Insufficient disk space: 450MB available, 1000MB required

Installation Directory Exists:

🔴 Installation directory already exists: ./heritage-data-processor
Choose a different location with --dir or remove the existing directory.

Network Connectivity Failure:

🔴 Cannot connect to GitHub.
Check:
 - Internet connectivity
 - VPN if required
 - Firewall settings

Installation Summary

Upon successful completion, the script displays a summary of the installation:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✅ Installation Complete!
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📦 Installation Summary:
 Location: ./heritage-data-processor
 Version: v1.2.0
 Python Environment: .venv (Python 3.11)
 Node Modules: node_modules
 Duration: 245s
 Log File: ./heritage-data-processor/install_hdp_20251021_143000.log

🚀 Next Steps:

 Start the application:
 ./heritage-data-processor/start_hdp.sh

 Or navigate to project directory first:
 cd ./heritage-data-processor
 ./start_hdp.sh