Update Script Reference¶
The Heritage Data Processor Update Script (update_hdp.sh) safely updates the repository and dependencies with extensive validation, backup, and rollback capabilities.
Overview¶
This production-ready update script provides automated updates with safety features including dry-run mode, automatic backups, intelligent file cleanup strategies, and comprehensive error recovery.
Usage¶
Basic Usage¶
Updates to latest version with interactive prompts for version selection and file cleanup decisions.
Command-Line Options¶
Options:
- --dry-run: Preview changes without applying them
- --non-interactive: Skip interactive prompts, use safe defaults
- --branch BRANCH: Update to specific branch. Default: main
- --version TAG: Update to specific version tag
- --latest: Update to latest version. Default behavior
- --cleanup: Aggressively remove files not in target version
- --help: Display usage information and exit
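How the script parses these options is not shown in this reference. The following is a minimal sketch of one way to handle them in POSIX shell. The variable names (DRY_RUN, INTERACTIVE, CLEANUP, TARGET_KIND, TARGET) and the function name parse_args are assumptions made for illustration, not names taken from update_hdp.sh.

```shell
# Hypothetical option parser; variable and function names are illustrative only.
# Defaults mirror the documented behaviour: interactive, latest version, no cleanup.
DRY_RUN=false
INTERACTIVE=true
CLEANUP=false
TARGET_KIND=latest   # one of: latest, branch, version
TARGET=""

parse_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --dry-run)         DRY_RUN=true ;;
      --non-interactive) INTERACTIVE=false ;;
      --cleanup)         CLEANUP=true ;;
      --latest)          TARGET_KIND=latest; TARGET="" ;;
      --branch|--version)
        # These two options require a value.
        if [ "$#" -lt 2 ]; then
          echo "Missing value for $1" >&2
          return 2
        fi
        TARGET_KIND=${1#--}   # "branch" or "version"
        TARGET=$2
        shift ;;
      --help)
        echo "Usage: update_hdp.sh [--dry-run] [--non-interactive] [--branch BRANCH | --version TAG | --latest] [--cleanup]"
        return 0 ;;
      *)
        echo "Unknown option: $1" >&2
        return 2 ;;
    esac
    shift
  done
}
```

With this sketch, `parse_args --non-interactive --cleanup` would correspond to the automated CI/CD case, and `parse_args --version v1.2.5` to a pinned update (the `v` prefix is assumed from the version list shown under Step 1).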
Usage Examples¶
Example 1: Standard Update¶
Interactive update to latest version with file preservation prompts.
Example 2: Specific Version¶
Updates directly to version 1.2.5.
Example 3: Automated Update¶
Fully automated update with aggressive cleanup for CI/CD.
Example 4: Preview Changes¶
Preview all changes without applying them.
Example 5: Branch Update¶
Updates to development branch instead of tagged release.
Requirements¶
Required Tools¶
- git: Version control operations
- uv: Python package management
Optional Tools¶
- npm, pnpm, or yarn: Node.js package management (detected automatically)
System Resources¶
- Disk Space: Minimum 500 MB free space
- Memory: At least 100 MB available (warning if less)
- Network: Internet connectivity for remote repository access
Update Process¶
Pre-Flight Checks¶
Comprehensive validation before any changes are made.
System Validation:
- Tool Availability: Verifies git and uv are installed
- Git Repository: Confirms running inside a git repository with commits
- Disk Space: Ensures at least 500 MB available
- Memory: Checks available RAM (warns if low)
- Git Configuration: Validates user.name and user.email are set
- Network: Tests connectivity to remote repository
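The tool and disk-space checks above could look roughly like the sketch below. The function names (require_tool, free_mb, has_free_mb, preflight) are hypothetical; the script's actual implementation is not shown in this reference. The sketch relies on `command -v` and the POSIX output format of `df -Pk`.

```shell
# Succeeds if the named command is available on PATH.
require_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Prints the free space, in MB, of the filesystem containing the given path.
free_mb() {
  df -Pk "${1:-.}" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeeds if at least $1 MB are free under path $2 (default: current directory).
has_free_mb() {
  [ "$(free_mb "${2:-.}")" -ge "$1" ]
}

# Hypothetical pre-flight routine combining the checks.
preflight() {
  for tool in git uv; do
    require_tool "$tool" || { echo "Missing required tool: $tool" >&2; return 1; }
  done
  has_free_mb 500 || { echo "Less than 500 MB of free disk space" >&2; return 1; }
}
```

The memory, git configuration, and network checks are left out here because they depend on platform details and on repository state.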
Git State Validation:
- Merge/Rebase: Detects ongoing merge or rebase operations
- Remote: Verifies origin remote is configured
- Connectivity: Runs git ls-remote to confirm repository access
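Git records an in-progress merge or rebase as marker files inside its metadata directory (MERGE_HEAD for a merge, rebase-merge or rebase-apply for a rebase). A minimal sketch of a detector built on that, with hypothetical function names, assuming the git directory path is passed in:

```shell
# Succeeds if a merge is in progress in the given git directory.
merge_in_progress() {
  [ -f "$1/MERGE_HEAD" ]
}

# Succeeds if a rebase is in progress in the given git directory.
rebase_in_progress() {
  [ -d "$1/rebase-merge" ] || [ -d "$1/rebase-apply" ]
}

# Prints a short verdict and returns non-zero when the repository is mid-operation.
check_git_state() {
  git_dir=$1
  if merge_in_progress "$git_dir"; then
    echo "Merge in progress"
    return 1
  fi
  if rebase_in_progress "$git_dir"; then
    echo "Rebase in progress"
    return 1
  fi
  echo "Git state OK"
}
```

In a real repository the directory would typically come from `git rev-parse --git-dir`, for example `check_git_state "$(git rev-parse --git-dir)"`.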
Error Messages:
Git Not Configured:
🔴 Git user.name and user.email must be configured.
Run:
git config --global user.name 'Your Name'
git config --global user.email 'your.email@example.com'
Merge in Progress:
🔴 Merge in progress. Complete or abort the merge first:
git merge --abort # to abort
git merge --continue # to complete
Cannot Connect to Remote:
🔴 Cannot connect to remote repository.
Check:
- Internet connectivity
- VPN connection if required
- SSH keys or credentials
- Repository access permissions
Step 1: Version Selection¶
Fetches available versions and allows user to select target.
Process:
- Fetch Remote: Runs git fetch --all --tags --prune
- Current Version: Displays current tag or branch
- Available Versions: Lists tags sorted by semantic version
- Selection: Interactive or automatic based on mode
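Sorting tags by semantic version matters because a plain text sort would place v1.10.0 before v1.2.5. Git can do this itself with `git tag --sort=-v:refname`. The sketch below shows the same ordering on a list of tags read from standard input; it assumes every tag has the form vMAJOR.MINOR.PATCH, and the function names are hypothetical.

```shell
# Sorts tags of the form vMAJOR.MINOR.PATCH from newest to oldest.
# Assumes every input line starts with "v" followed by three numeric fields.
sort_versions_desc() {
  sed 's/^v//' | sort -t. -k1,1nr -k2,2nr -k3,3nr | sed 's/^/v/'
}

# Prints only the newest tag.
latest_version() {
  sort_versions_desc | head -n 1
}
```

In non-interactive mode, something like `git tag | latest_version` would yield the latest tagged version.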
Interactive Selection:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Available Versions
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
0) Latest (main branch)
1) v1.3.0
2) v1.2.5
3) v1.2.4
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Select version (0-15, default: 0):
Non-Interactive Mode:
Automatically selects latest tagged version or uses --version parameter.
Step 2: File Change Analysis¶
Analyzes which files will be added, modified, or deleted during update.
Analysis Process:
- Get Target Commit: Resolves target version to commit SHA
- Compare Trees: Uses git diff to compare current and target
- Categorize Changes:
  - Tracked files to delete (exist in current, not in target)
  - Untracked files (not in git, may be user data)
  - Files to modify (changed between versions)
  - Files to add (new in target version)
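The categorization of tracked files can be illustrated with the output of `git diff --name-status`, which prefixes each path with a status letter (A for added, M for modified, D for deleted). The sketch below counts those categories from standard input. The function name summarize_changes is hypothetical, and renames or copies (status R or C) are ignored for simplicity.

```shell
# Reads "git diff --name-status" output on stdin and prints per-category counts.
# Only the A, M and D status letters are counted; other statuses are skipped.
summarize_changes() {
  awk '
    $1 == "A" { added++ }
    $1 == "M" { modified++ }
    $1 == "D" { deleted++ }
    END { printf "added=%d modified=%d deleted=%d\n", added + 0, modified + 0, deleted + 0 }
  '
}
```

A typical call would be `git diff --name-status HEAD "$TARGET" | summarize_changes`, where TARGET is an assumed variable holding the resolved target commit. Untracked files do not appear in a diff; they can be listed separately with `git ls-files --others --exclude-standard`.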
Display Format:
[INFO] Files that will be updated/added: 23
Modified files:
~ server_app/routes/zenodo.py
~ requirements.txt
~ package.json
New files:
+ server_app/routes/pipeline_manager.py
+ docs/api/pipeline.md
Deletion Warning:
⚠️ 15 file(s) would be affected by cleanup:
- 3 tracked file(s) removed in target version
- 12 untracked file(s) (not in git)
Files that would be deleted:
- [tracked] old_module.py
- [tracked] deprecated_script.sh
- [untracked] my_config.yaml
- [untracked] user_data.json
File Cleanup Strategies¶
Three strategies for handling files that don't exist in target version:
Strategy 1: Delete All Files (clean-all)
Removes all files not present in target version, including untracked files:
- Enabled with --cleanup flag
- Most aggressive cleanup
- Suitable for fresh installs or CI environments
Strategy 2: Preserve Untracked Files
Removes tracked files deleted from target, but keeps untracked files:
- Balances cleanup with data preservation
- Removes obsolete tracked files
- Protects user-created files
Strategy 3: Preserve All Files
Keeps all extra files regardless of status:
- Default in non-interactive mode
- Safest option for production
- Prevents accidental data loss
Interactive Selection:
Options:
1) Delete all files (clean update)
2) Keep untracked files only (delete tracked removals)
3) Keep all files (preserve everything)
Choose option (1-3, default: 3):
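The mapping from the prompt answer to a cleanup mode can be sketched as follows. The mode names clean-all, preserve-untracked and preserve-all are the ones used in this reference; the function name resolve_cleanup_mode and its arguments are assumptions for illustration.

```shell
# Maps a menu choice (1-3, empty means default) to a cleanup mode name.
# $1 = user's choice, $2 = "true" if --cleanup was passed (forces clean-all).
resolve_cleanup_mode() {
  choice=$1
  cleanup_flag=${2:-false}
  if [ "$cleanup_flag" = "true" ]; then
    echo "clean-all"
    return 0
  fi
  case "$choice" in
    1)    echo "clean-all" ;;
    2)    echo "preserve-untracked" ;;
    3|"") echo "preserve-all" ;;   # default: keep everything
    *)    echo "invalid"; return 1 ;;
  esac
}
```

An empty choice falls through to preserve-all, which matches both the interactive default (3) and the documented non-interactive default.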
Step 3: Backup Creation¶
Creates automatic backup of local changes before updating.
Stash Process:
- Detect Changes: Checks for unstaged, staged, and untracked changes
- Create Stash: Uses git stash push with descriptive message
- Verify Stash: Confirms stash was created successfully
Stash Naming:
Format: update-backup-{timestamp}
Example: update-backup-1698765432
Stash Strategy:
Depends on cleanup mode:
- clean-all: Stashes including untracked files (-u flag)
- preserve modes: Stashes tracked changes only
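A minimal sketch of the naming scheme and the mode-dependent stash command described above. The function names are hypothetical, and the sketch only prints the command it would run rather than touching a repository. `git stash push` accepts `-u` to include untracked files and `-m` to set the message.

```shell
# Builds a stash name of the form update-backup-{timestamp}.
# An explicit timestamp may be passed; otherwise the current epoch time is used.
stash_name() {
  printf 'update-backup-%s\n' "${1:-$(date +%s)}"
}

# Prints the stash command for a cleanup mode ($1) and stash name ($2).
stash_command() {
  mode=$1
  name=$2
  if [ "$mode" = "clean-all" ]; then
    # clean-all also stashes untracked files, since they are about to be removed.
    printf 'git stash push -u -m %s\n' "$name"
  else
    printf 'git stash push -m %s\n' "$name"
  fi
}
```

For example, `stash_command clean-all "$(stash_name)"` prints the command that includes untracked files.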
Dry-Run Mode:
Step 4: Repository Update¶
Updates git repository to target version using appropriate strategy.
Update Strategies:
Clean Checkout (clean-all mode):
Selective Cleanup (preserve-untracked mode):
Merge Checkout (preserve-all mode):
If merge checkout fails, uses alternative preservation method:
- Backup Files: Copies files that would be deleted to temp directory
- Checkout: Performs regular checkout
- Restore Files: Copies backed-up files back to their locations
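The backup-and-restore fallback can be sketched with plain file copies. This is an illustration under stated assumptions, not the script's code: the function names are hypothetical, paths are assumed to be relative to the repository root, and the backup directory is assumed to be an absolute path such as one returned by `mktemp -d`.

```shell
# Copies the listed files into a backup directory, keeping their relative paths.
# $1 = backup directory (absolute path), remaining arguments = relative file paths.
backup_files() {
  dest=$1
  shift
  for f in "$@"; do
    [ -e "$f" ] || continue              # skip paths that do not exist
    mkdir -p "$dest/$(dirname "$f")"
    cp -p "$f" "$dest/$f"
  done
}

# Copies every file in the backup directory back to the same relative path
# under the current directory.
restore_files() {
  (cd "$1" && find . -type f) | while IFS= read -r f; do
    mkdir -p "$(dirname "$f")"
    cp -p "$1/$f" "$f"
  done
}
```

The checkout itself would run between the two calls; files that the checkout removed are then put back from the temporary copy.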
Branch Handling:
If target is a branch (not a tag), performs pull after checkout:
Commit Information:
Displays current commit after update:
Step 5: Python Dependencies Update¶
Updates Python packages to match target version requirements.
Dependency Files:
Searches for the following files, in priority order:
- pyproject.toml (modern Python projects)
- requirements.txt (traditional approach)
Virtual Environment:
Searches for an existing venv:
- .venv (preferred)
- venv (alternative)
If not found, creates .venv with Python 3.11:
Installation Commands:
For pyproject.toml:
For requirements.txt:
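The exact commands are not reproduced in this reference. As a rough sketch of the lookup order, assuming `uv sync` for pyproject.toml (the same command the error message below suggests) and `uv pip install -r requirements.txt` for the fallback, with hypothetical function names:

```shell
# Prints the install command for the dependency file found in directory $1.
# pyproject.toml takes priority over requirements.txt; fails if neither exists.
python_install_command() {
  dir=${1:-.}
  if [ -f "$dir/pyproject.toml" ]; then
    echo "uv sync"
  elif [ -f "$dir/requirements.txt" ]; then
    echo "uv pip install -r requirements.txt"
  else
    return 1
  fi
}

# Prints the name of the first virtual environment directory found in $1.
# .venv is preferred over venv; fails if neither exists.
find_venv() {
  dir=${1:-.}
  for candidate in .venv venv; do
    if [ -d "$dir/$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}
```

When find_venv fails, a new environment could be created with something like `uv venv --python 3.11 .venv`; whether the script uses exactly that invocation is an assumption.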
Error Handling:
Provides detailed troubleshooting for common failures:
🔴 Failed to sync Python dependencies.
Common causes:
- Package version conflicts
- Network issues
- Missing build dependencies
Check the log for details: update_hdp_20251021_140000.log
Try manually:
uv sync --verbose
Step 6: Node.js Dependencies Update¶
Updates Node.js packages using detected package manager.
Package Manager Detection:
Automatic detection based on lock files:
- pnpm-lock.yaml → pnpm
- yarn.lock → yarn
- package-lock.json → npm
- package.json only → npm (with warning)
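The lock-file lookup above can be sketched as a small function. The name detect_package_manager and the wording of the warning are assumptions; the precedence follows the list.

```shell
# Prints the package manager implied by the lock file found in directory $1.
# Fails (non-zero) when the directory contains no Node.js project at all.
detect_package_manager() {
  dir=${1:-.}
  if [ -f "$dir/pnpm-lock.yaml" ]; then
    echo "pnpm"
  elif [ -f "$dir/yarn.lock" ]; then
    echo "yarn"
  elif [ -f "$dir/package-lock.json" ]; then
    echo "npm"
  elif [ -f "$dir/package.json" ]; then
    # No lock file: fall back to npm, but warn on stderr.
    echo "Warning: no lock file found, falling back to npm" >&2
    echo "npm"
  else
    return 1
  fi
}
```

The result can then drive the install step, for example `"$(detect_package_manager .)" install`.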
Installation:
Error Diagnostics:
🔴 Failed to install Node.js dependencies.
Common causes:
- Network issues
- Package version conflicts
- Peer dependency issues
- Registry authentication
Check the log: update_hdp_20251021_140000.log
Try manually:
npm install --verbose
Step 7: Version Info Update¶
Updates metadata file tracking installation version.
File Location: .installed_version
Contents:
Purpose: Tracks which version is currently installed for troubleshooting.
Step 8: Restore Stashed Changes¶
Attempts to restore previously stashed changes.
Restoration:
Conflict Handling:
If conflicts occur during stash pop, provides detailed resolution instructions:
═══════════════════════════════════════════════════════════
MANUAL INTERVENTION REQUIRED
═══════════════════════════════════════════════════════════
Your stashed changes conflict with the updated code.
Your changes are safe in: update-backup-1698765432
To resolve:
1. Check conflicts: git status
2. Edit conflicting files and resolve markers (<<<<, ====, >>>>)
3. Stage resolved files: git add <file>
4. Test your changes
5. Drop the stash: git stash drop
To abort and restore original state:
git reset --hard
git stash pop
Log file: update_hdp_20251021_140000.log
═══════════════════════════════════════════════════════════
No Stash:
If no changes were stashed, skips this step:
Rollback & Recovery¶
Automatic Rollback¶
On error, attempts to restore previous state:
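cleanup_on_error() {
    info "Attempting to restore previous state..."
    if [ -n "$BACKUP_STASH" ]; then
        info "Restoring from backup stash: $BACKUP_STASH"
        # Look the stash up by its message instead of assuming it is stash@{0}.
        stash_ref=$(git stash list | grep -F "$BACKUP_STASH" | head -n 1 | cut -d: -f1)
        [ -n "$stash_ref" ] && git stash pop "$stash_ref" 2>/dev/null || true
    fi
}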
Trap Registration:
Cleanup function is registered for error signals:
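The registration line itself is not shown in this reference; in bash it is commonly written as `trap cleanup_on_error ERR`. The sketch below shows the same idea in portable POSIX shell, using an EXIT trap that checks the exit status. The run_guarded wrapper is hypothetical, and cleanup_on_error here is a stand-in that only prints a message.

```shell
# Stand-in for the real rollback function; it only reports what it would do.
cleanup_on_error() {
  echo "[INFO] Attempting to restore previous state..."
}

# Runs a command in a subshell and calls cleanup_on_error if it fails.
# The EXIT trap fires when the subshell ends; the saved status decides
# whether a rollback is needed.
run_guarded() {
  (
    trap 'rc=$?; if [ "$rc" -ne 0 ]; then cleanup_on_error; fi' EXIT
    "$@"
    exit $?
  )
}
```

Calling `run_guarded false` prints the rollback message and returns a non-zero status, while `run_guarded true` prints nothing.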
Manual Rollback¶
If automatic rollback fails, the log file contains the complete history of the run for manual recovery.
Manual Recovery Steps:
- Check log file for point of failure
- Review git status
- Restore from stash if needed
- Reset to previous commit if necessary
Dry-Run Mode¶
Purpose¶
Preview all changes without modifying anything.
Activation:
Behavior:
- Performs all checks and analysis
- Displays what would be done
- Skips actual git operations
- Skips dependency installations
- Logs all planned actions
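A common way to get this behavior is to route every state-changing command through a small wrapper that either runs it or only reports it. The sketch below assumes a DRY_RUN variable and a wrapper named run; both names are illustrative, and the message format is modeled on the example output in this section.

```shell
# "true" means report commands instead of running them.
DRY_RUN=${DRY_RUN:-false}

# Runs the given command, or only logs it when DRY_RUN is "true".
run() {
  if [ "$DRY_RUN" = "true" ]; then
    echo "[INFO] [DRY RUN] Would run: $*"
  else
    "$@"
  fi
}
```

With `DRY_RUN=true`, a call such as `run npm install` prints the planned action and changes nothing; with `DRY_RUN=false` the command executes normally.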
Example Output:
⚠️ DRY RUN MODE - No changes will be made
[INFO] [DRY RUN] Would create stash: update-backup-1698765432
[INFO] [DRY RUN] Would checkout: v1.3.0
[INFO] [DRY RUN] Would delete files not in target version
[INFO] [DRY RUN] Would update Python dependencies
[INFO] [DRY RUN] Would run: npm install
[INFO] [DRY RUN] Would update version info
[INFO] [DRY RUN] Would restore stash: update-backup-1698765432
⚠️ DRY RUN completed - no changes were made
[INFO] Re-run without --dry-run to apply changes
Update Summary¶
Displays comprehensive summary after successful update: