Git Integration for Developers¶
This document explains the technical implementation of biotope's Git-on-Top strategy for metadata version control.
Architecture Overview¶
Biotope implements a Git-on-Top strategy where all version control operations delegate to Git via subprocess.run()
calls. No custom version control is implemented.
Key Principles¶
- Git Wrapper Pattern: All biotope commands wrap Git operations with metadata-specific logic
- No Custom Version Control: Zero custom commit history, branching, or remote handling
- Separation of Concerns: Data files in data/, metadata in .biotope/datasets/
- Croissant ML Integration: Metadata follows the Croissant ML standard with validation
- Project Metadata Management: Project-level metadata stored in .biotope/config/biotope.yaml for annotation pre-fill
Project Structure¶
your-project/
├── .git/ # Git repository (handled by Git)
├── .biotope/ # Git-tracked metadata
│ ├── datasets/ # Croissant ML JSON-LD files
│ ├── config/ # Biotope configuration (Git-like approach)
│ │ └── biotope.yaml # Consolidated configuration and project metadata
│ ├── workflows/ # Bioinformatics workflows
│ └── logs/ # Command execution logs
├── data/ # Data files (not in Git)
│ ├── raw/
│ └── processed/
├── config/ # User configuration
├── schemas/ # Knowledge schemas
└── outputs/ # Generated outputs
Implementation Details¶
Core Commands¶
biotope init (biotope/commands/init.py)¶
def _init_git_repo(directory: Path) -> None:
    """Initialize a Git repository in the directory."""
    subprocess.run(["git", "init"], cwd=directory, check=True)
    subprocess.run(["git", "add", "."], cwd=directory, check=True)
    subprocess.run(["git", "commit", "-m", "Initial biotope project setup"], cwd=directory, check=True)
def _collect_project_metadata() -> Dict:
    """Collect project-level metadata for annotation pre-fill."""
    metadata = {}
    if click.confirm("Would you like to collect project-level metadata for annotation pre-fill?"):
        metadata["description"] = click.prompt("Project description")
        metadata["url"] = click.prompt("Project URL")
        metadata["creator"] = {
            "name": click.prompt("Creator name"),
            "email": click.prompt("Creator email")
        }
        metadata["license"] = click.prompt("License")
        metadata["citation"] = click.prompt("Citation")
    return metadata
def _create_biotope_config(biotope_root: Path, config: Dict) -> None:
    """Create biotope configuration file with project metadata."""
    config_dir = biotope_root / ".biotope" / "config"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_data = {
        "project_name": config.get("project_name"),
        "git_integration": config.get("git_integration", True),
        "knowledge_graph": config.get("knowledge_graph", {}),
        "project_metadata": config.get("project_metadata", {}),
        "annotation_validation": config.get("annotation_validation", {})
    }
    with open(config_dir / "biotope.yaml", "w") as f:
        yaml.dump(config_data, f, default_flow_style=False)
- Creates the .biotope/ directory structure
- Initializes a Git repository with subprocess.run(["git", "init"])
- Collects project-level metadata for annotation pre-fill
- Creates an initial commit with the project setup
- Conditionally shows output format selection only when the knowledge graph is enabled
biotope config (biotope/commands/config.py)¶
def set_project_metadata() -> None:
    """Set project-level metadata for annotation pre-fill."""
    biotope_root = find_biotope_root()
    if not biotope_root:
        click.echo("❌ Not in a biotope project. Run 'biotope init' first.")
        raise click.Abort()
    config = load_biotope_config(biotope_root)
    project_metadata = config.get("project_metadata", {})
    # Interactive metadata collection
    project_metadata["description"] = click.prompt("Project description", default=project_metadata.get("description", ""))
    project_metadata["url"] = click.prompt("Project URL", default=project_metadata.get("url", ""))
    project_metadata["creator"] = {
        "name": click.prompt("Creator name", default=project_metadata.get("creator", {}).get("name", "")),
        "email": click.prompt("Creator email", default=project_metadata.get("creator", {}).get("email", ""))
    }
    project_metadata["license"] = click.prompt("License", default=project_metadata.get("license", ""))
    project_metadata["citation"] = click.prompt("Citation", default=project_metadata.get("citation", ""))
    # Update configuration
    config["project_metadata"] = project_metadata
    save_biotope_config(biotope_root, config)
def show_project_metadata() -> None:
    """Display current project-level metadata configuration."""
    biotope_root = find_biotope_root()
    if not biotope_root:
        click.echo("❌ Not in a biotope project. Run 'biotope init' first.")
        raise click.Abort()
    config = load_biotope_config(biotope_root)
    project_metadata = config.get("project_metadata", {})
    console = Console()
    console.print("\n[bold blue]Project Metadata Configuration[/]")
    console.print(f"Description: {project_metadata.get('description', 'Not set')}")
    console.print(f"URL: {project_metadata.get('url', 'Not set')}")
    console.print(f"Creator: {project_metadata.get('creator', {}).get('name', 'Not set')} ({project_metadata.get('creator', {}).get('email', 'Not set')})")
    console.print(f"License: {project_metadata.get('license', 'Not set')}")
    console.print(f"Citation: {project_metadata.get('citation', 'Not set')}")
- Manages project-level metadata for annotation pre-fill
- Provides interactive metadata collection and display
- Stores metadata in .biotope/config/biotope.yaml
biotope commit (biotope/commands/commit.py)¶
def _create_git_commit(biotope_root: Path, message: str, author: Optional[str], amend: bool) -> Optional[str]:
    """Create a Git commit for .biotope/ changes."""
    cmd = ["git", "commit"]
    if amend:
        cmd.append("--amend")
    if author:
        cmd.extend(["--author", author])
    cmd.extend(["-m", message])
    result = subprocess.run(cmd, cwd=biotope_root, capture_output=True, text=True, check=True)
    # Extract commit hash from output
- Validates Croissant ML metadata before committing
- Stages the .biotope/ directory with git add .biotope/
- Delegates the actual commit to Git via subprocess.run()
- Supports standard Git options (--amend, --author)
biotope status (biotope/commands/status.py)¶
def _get_git_status(biotope_root: Path, biotope_only: bool) -> Dict[str, List]:
    """Get Git status for .biotope/ directory."""
    cmd = ["git", "status", "--porcelain"]
    if biotope_only:
        cmd.append(".biotope/")
    result = subprocess.run(cmd, cwd=biotope_root, capture_output=True, text=True, check=True)
    # Parse Git's porcelain output format
- Uses git status --porcelain for machine-readable output
- Parses Git's status format (A, M, D, ??)
- Focuses on .biotope/ directory changes
biotope log (biotope/commands/log.py)¶
def _get_git_log(biotope_root: Path, max_count: Optional[int] = None, since: Optional[str] = None, author: Optional[str] = None, biotope_only: bool = False) -> List[Dict]:
    """Get Git log with optional filtering."""
    cmd = ["git", "log", "--pretty=format:%H|%an|%ad|%s", "--date=short"]
    if biotope_only:
        cmd.append("--")
        cmd.append(".biotope/")
    result = subprocess.run(cmd, cwd=biotope_root, capture_output=True, text=True, check=True)
    # Parse commit lines: hash|author|date|message
- Uses git log with a custom format for parsing
- Supports all standard Git log options
- Filters for the .biotope/ directory with -- .biotope/
biotope push / biotope pull (biotope/commands/push.py, biotope/commands/pull.py)¶
def _push_changes(biotope_root: Path, remote: str, branch: str, force: bool) -> bool:
    """Push changes to remote repository."""
    cmd = ["git", "push"]
    if force:
        cmd.append("--force")
    cmd.extend([remote, branch])
    subprocess.run(cmd, cwd=biotope_root, capture_output=True, text=True, check=True)
    return True
- Direct delegation to git push and git pull
- Supports standard Git options (--force, --rebase)
- No custom remote handling
Supporting Commands¶
biotope add (biotope/commands/add.py)¶
def _stage_git_changes(biotope_root: Path) -> None:
    """Stage .biotope/ changes in Git."""
    subprocess.run(["git", "add", ".biotope/"], cwd=biotope_root, check=True)
- Creates Croissant ML metadata files in .biotope/datasets/
- Calculates SHA256 checksums for data integrity
- Stages changes with git add .biotope/
biotope mv (biotope/commands/mv.py)¶
def _execute_move(source: Path, destination: Path, biotope_root: Path, console: Console) -> None:
    """Execute the actual move operation."""
    # Move the actual data file
    shutil.move(str(source), str(destination))
    # Compute project-relative paths and the new checksum
    source_rel = source.relative_to(biotope_root)
    destination_rel = destination.relative_to(biotope_root)
    new_checksum = calculate_file_checksum(destination)
    # Update metadata files to reflect the new path
    metadata_files = _find_metadata_files_for_file(source, biotope_root)
    for metadata_file in metadata_files:
        _update_metadata_file_path(metadata_file, str(source_rel), str(destination_rel), new_checksum, biotope_root)
        # Move metadata files to mirror the new data file structure
        new_metadata_path = biotope_root / ".biotope" / "datasets" / destination_rel.with_suffix(".jsonld")
        shutil.move(str(metadata_file), str(new_metadata_path))
    # Stage changes for commit
    _stage_git_changes(biotope_root)
- Moves data files to new locations using shutil.move()
- Updates Croissant ML metadata files to reflect new paths
- Recalculates SHA256 checksums for moved files
- Moves metadata files to mirror the new data file structure
- Stages changes with git add .biotope/
- Supports both file and directory moves (with --recursive flag)
- Validates move operations (no moving outside the project, no overwriting without --force)
Move Command Implementation Details¶
The biotope mv command implements several key functions:
File Move Operations¶
def _execute_move(source: Path, destination: Path, biotope_root: Path, console: Console) -> None:
    """Execute the actual move operation for a single file."""
    # Create destination directory structure
    destination.parent.mkdir(parents=True, exist_ok=True)
    # Find all metadata files referencing the source file
    metadata_files = _find_metadata_files_for_file(source, biotope_root)
    # Move the actual data file
    shutil.move(str(source), str(destination))
    # Calculate new checksum for moved file
    new_checksum = calculate_file_checksum(destination)
    # Project-relative paths as recorded in metadata
    source_rel = source.relative_to(biotope_root)
    destination_rel = destination.relative_to(biotope_root)
    # Update metadata files and move them to new locations
    for metadata_file in metadata_files:
        _update_metadata_file_path(metadata_file, str(source_rel), str(destination_rel), new_checksum, biotope_root)
        # Move metadata file to mirror new data file structure
        new_metadata_path = biotope_root / ".biotope" / "datasets" / destination_rel.with_suffix(".jsonld")
        if new_metadata_path != metadata_file:
            shutil.move(str(metadata_file), str(new_metadata_path))
Directory Move Operations¶
def _execute_directory_move(source: Path, destination: Path, biotope_root: Path, console: Console) -> None:
    """Execute move operation for a directory and all its tracked files."""
    # Find all tracked files in the source directory
    tracked_files = _find_tracked_files_in_directory(source, biotope_root)
    # Move the entire directory structure
    shutil.move(str(source), str(destination))
    # Handle metadata updates based on move type
    if is_simple_rename:
        # For simple renames, rename entire metadata directory structure
        shutil.move(str(source_metadata_dir), str(destination_metadata_dir))
    else:
        # For moves to different locations, handle each file individually
        for old_file_path in tracked_files:
            new_file_path = destination / old_file_path.relative_to(source)
            _update_metadata_for_moved_file(old_file_path, new_file_path, biotope_root)
Metadata Update Functions¶
def _update_metadata_file_path(metadata_file: Path, old_path: str, new_path: str, new_checksum: str, biotope_root: Path) -> bool:
    """Update metadata file to reflect new file path and checksum."""
    with open(metadata_file) as f:
        metadata = json.load(f)
    updated = False
    for distribution in metadata.get("distribution", []):
        if distribution.get("@type") == "sc:FileObject":
            if distribution.get("contentUrl") == old_path:
                distribution["contentUrl"] = new_path
                distribution["sha256"] = new_checksum
                distribution["dateModified"] = datetime.now(timezone.utc).isoformat()
                updated = True
    if updated:
        with open(metadata_file, "w") as f:
            json.dump(metadata, f, indent=2)
    return updated
Validation Functions¶
def _validate_move_operation(source: Path, destination: Path, biotope_root: Path, force: bool) -> Path:
    """Validate the move operation before executing."""
    # Resolve the destination (moving into an existing directory keeps the source name)
    actual_destination = destination / source.name if destination.is_dir() else destination
    # Ensure destination is within project bounds
    try:
        actual_destination.relative_to(biotope_root)
    except ValueError:
        click.echo("❌ Cannot move file outside biotope project")
        raise click.Abort()
    # Prevent moving biotope internal files
    source_rel = source.relative_to(biotope_root)
    if str(source_rel).startswith(".biotope/"):
        click.echo("❌ Cannot move biotope internal files")
        raise click.Abort()
    # Check for destination conflicts
    if actual_destination.exists() and not force:
        click.echo("❌ Destination exists. Use --force to overwrite.")
        raise click.Abort()
    return actual_destination
biotope check-data (biotope/commands/check_data.py)¶
def _get_recorded_checksum(file_path: Path, biotope_root: Path) -> Optional[str]:
    """Get the recorded checksum for a file."""
    datasets_dir = biotope_root / ".biotope" / "datasets"
    for dataset_file in datasets_dir.glob("*.jsonld"):
        with open(dataset_file) as f:
            metadata = json.load(f)
        for distribution in metadata.get("distribution", []):
            if distribution.get("@type") == "sc:FileObject":
                content_url = distribution.get("contentUrl")
                if content_url and (biotope_root / content_url) == file_path:
                    return distribution.get("sha256")
    return None
- Reads checksums from Croissant ML metadata files
- Validates data integrity against recorded checksums
- No version control functionality
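The other half of the integrity check is computing the current checksum of the file on disk. A plausible implementation of the calculate_file_checksum helper referenced in the mv command above (chunked reads keep memory flat for large data files; the exact signature is an assumption):

```python
import hashlib
from pathlib import Path

def calculate_file_checksum(file_path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 checksum of a file, reading in fixed-size chunks."""
    sha256 = hashlib.sha256()
    with open(file_path, "rb") as f:
        # iter() with a sentinel yields chunks until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha256.update(chunk)
    return sha256.hexdigest()
```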
Project Metadata Integration¶
Configuration Management¶
Project metadata is managed through a centralized configuration system:
def load_biotope_config(biotope_root: Path) -> Dict:
    """Load biotope configuration with project metadata."""
    config_path = biotope_root / ".biotope" / "config" / "biotope.yaml"
    if not config_path.exists():
        return {}
    try:
        with open(config_path) as f:
            config = yaml.safe_load(f) or {}
    except (yaml.YAMLError, IOError):
        return {}
    return config

def save_biotope_config(biotope_root: Path, config: Dict) -> None:
    """Save biotope configuration with project metadata."""
    config_path = biotope_root / ".biotope" / "config" / "biotope.yaml"
    config_path.parent.mkdir(parents=True, exist_ok=True)
    with open(config_path, "w") as f:
        yaml.dump(config, f, default_flow_style=False)
Annotation Pre-fill Integration¶
The annotation system integrates project metadata for pre-filling:
def load_project_metadata(biotope_root: Path) -> Dict:
    """Load project-level metadata from biotope configuration for pre-filling annotations."""
    config_path = biotope_root / ".biotope" / "config" / "biotope.yaml"
    if not config_path.exists():
        return {}
    try:
        import yaml
        with open(config_path) as f:
            config = yaml.safe_load(f) or {}
    except (yaml.YAMLError, IOError):
        return {}
    # Extract project metadata from configuration
    project_metadata = config.get("project_metadata", {})
    # Convert to Croissant format for pre-filling
    croissant_metadata = {}
    if project_metadata.get("description"):
        croissant_metadata["description"] = project_metadata["description"]
    if project_metadata.get("url"):
        croissant_metadata["url"] = project_metadata["url"]
    if project_metadata.get("creator"):
        croissant_metadata["creator"] = project_metadata["creator"]
    if project_metadata.get("license"):
        croissant_metadata["license"] = project_metadata["license"]
    if project_metadata.get("citation"):
        croissant_metadata["citation"] = project_metadata["citation"]
    return croissant_metadata
Configuration File Structure¶
The .biotope/config/biotope.yaml file structure (Git-like approach):
version: "1.0"
croissant_schema_version: "1.0"
default_metadata_template: "scientific"
data_storage:
  type: "local"
  path: "data"
  checksum_algorithm: "sha256"
auto_stage: true
commit_message_template: "Update metadata: {description}"

# Project information (consolidated internal metadata)
project_info:
  name: "my-project"
  created_at: "2024-01-01T00:00:00Z"
  biotope_version: "0.1.0"
  last_modified: "2024-01-01T00:00:00Z"
  builds: []
  knowledge_sources: []

# Project-level metadata for annotation pre-fill
project_metadata:
  description: "Project description"
  url: "https://example.com/project"
  creator:
    name: "John Doe"
    email: "john@example.com"
  license: "MIT"
  citation: "Doe, J. (2024). Project Title. Journal Name."

# Validation configuration
annotation_validation:
  enabled: true
  minimum_required_fields:
    - "name"
    - "description"
    - "creator"
    - "dateCreated"
    - "distribution"
  field_validation:
    name:
      type: "string"
      min_length: 1
    description:
      type: "string"
      min_length: 10
    creator:
      type: "object"
      required_keys: ["name"]
    dateCreated:
      type: "string"
      format: "date"
    distribution:
      type: "array"
      min_length: 1
  remote_config:
    url: "https://cluster.example.com/validation.yaml"
    cache_duration: 3600
    fallback: true
Git Integration Patterns¶
Common Helper Functions¶
All commands use these shared helper functions:
def _is_git_repo(directory: Path) -> bool:
    """Check if directory is a Git repository."""
    try:
        subprocess.run(["git", "rev-parse", "--git-dir"], cwd=directory, capture_output=True, text=True, check=True)
        return True
    except subprocess.CalledProcessError:
        return False

def find_biotope_root() -> Optional[Path]:
    """Find the biotope project root directory."""
    current = Path.cwd()
    while current != current.parent:
        if (current / ".biotope").exists():
            return current
        current = current.parent
    return None
Error Handling¶
All commands follow consistent error handling:
try:
    result = subprocess.run(cmd, cwd=biotope_root, capture_output=True, text=True, check=True)
    return result.stdout.strip()
except subprocess.CalledProcessError as e:
    click.echo(f"❌ Git error: {e}")
    if e.stderr:
        click.echo(f"Error details: {e.stderr}")
    return None
Metadata Validation¶
All metadata is validated against the Croissant ML schema before being committed. The validation process checks:
- Required Fields: Ensures all minimum required fields are present
- Field Types: Validates data types match expected schemas
- Field Constraints: Checks length, format, and content requirements
- Schema Compliance: Ensures metadata follows Croissant ML standards
Remote Validation Configuration¶
Biotope supports remote validation configurations to enforce institutional or cluster-wide policies. This allows administrators to maintain centralized validation requirements that are automatically applied to all projects.
Architecture¶
def load_biotope_config(biotope_root: Path) -> Dict:
    """Load biotope configuration with remote validation support."""
    config = load_local_config(biotope_root)
    # Check for remote validation configuration
    validation_config = config.get("annotation_validation", {})
    remote_config = validation_config.get("remote_config", {})
    if remote_config and remote_config.get("url"):
        remote_validation = _load_remote_validation_config(remote_config, biotope_root)
        if remote_validation:
            # Merge remote config with local config (local takes precedence)
            merged_validation = _merge_validation_configs(remote_validation, validation_config)
            config["annotation_validation"] = merged_validation
    return config
Configuration Structure¶
# .biotope/config/biotope.yaml
annotation_validation:
  enabled: true
  remote_config:
    url: "https://cluster.example.com/biotope-validation.yaml"
    cache_duration: 3600  # seconds
    fallback_to_local: true
  # Local overrides (optional)
  minimum_required_fields: ["name", "description", "creator"]
Remote Configuration Format¶
# https://cluster.example.com/biotope-validation.yaml
annotation_validation:
  enabled: true
  minimum_required_fields:
    - name
    - description
    - creator
    - dateCreated
    - distribution
    - license
  field_validation:
    name:
      type: string
      min_length: 1
    description:
      type: string
      min_length: 10
    creator:
      type: object
      required_keys: [name]
    dateCreated:
      type: string
      format: date
    distribution:
      type: array
      min_length: 1
    license:
      type: string
      min_length: 5
Caching Strategy¶
Remote configurations are cached locally to improve performance and enable offline operation:
def _load_remote_validation_config(remote_config: Dict, biotope_root: Path) -> Optional[Dict]:
    """Load validation configuration from a remote URL with caching."""
    url = remote_config["url"]
    cache_duration = remote_config.get("cache_duration", 3600)
    # Check cache first
    cache_file = _get_cache_file_path(url, biotope_root)
    if cache_file.exists():
        cache_age = datetime.now() - datetime.fromtimestamp(cache_file.stat().st_mtime)
        if cache_age.total_seconds() < cache_duration:
            return load_cached_config(cache_file)
    # Fetch from remote and cache
    remote_config_data = fetch_remote_config(url)
    cache_remote_config(remote_config_data, cache_file)
    return remote_config_data
Configuration Merging¶
Local configurations can extend or override remote requirements:
def _merge_validation_configs(remote_config: Dict, local_config: Dict) -> Dict:
    """Merge remote and local validation configurations."""
    merged = remote_config.copy()
    # Merge required fields (union)
    remote_fields = set(remote_config.get("minimum_required_fields", []))
    local_fields = set(local_config.get("minimum_required_fields", []))
    merged["minimum_required_fields"] = list(remote_fields | local_fields)
    # Merge field validation (local overrides remote)
    remote_validation = remote_config.get("field_validation", {})
    local_validation = local_config.get("field_validation", {})
    merged["field_validation"] = {**remote_validation, **local_validation}
    return merged
CLI Commands¶
# Set remote validation URL
biotope config set-remote-validation --url https://cluster.example.com/validation.yaml
# Show remote validation status
biotope config show-remote-validation
# Clear validation cache
biotope config clear-validation-cache
# Remove remote validation
biotope config remove-remote-validation
Use Cases¶
- Institutional Clusters: Enforce consistent metadata standards across all research projects
- Multi-site Collaborations: Share validation requirements between institutions
- Compliance Requirements: Ensure datasets meet regulatory or funding requirements
- Quality Assurance: Maintain high metadata quality standards
Implementation Notes¶
- Fallback Behavior: Projects can fall back to local configuration if remote is unavailable
- Cache Management: Automatic cache invalidation based on configurable duration
- Security: HTTPS URLs recommended for production use
- Performance: Caching reduces network overhead and improves reliability
Croissant ML Integration¶
Metadata Structure¶
All metadata follows Croissant ML standard:
{
  "@context": {"@vocab": "https://schema.org/"},
  "@type": "Dataset",
  "name": "experiment-dataset",
  "description": "RNA-seq experiment data",
  "distribution": [
    {
      "@type": "sc:FileObject",
      "@id": "file_abc123",
      "name": "experiment.csv",
      "contentUrl": "data/raw/experiment.csv",
      "sha256": "abc123...",
      "contentSize": 1024
    }
  ]
}
Data Integrity¶
- SHA256 checksums embedded in metadata
- Automatic validation before commits
- Data integrity checking via biotope check-data
Testing¶
Test Structure¶
Tests verify Git integration without mocking Git:
def test_commit_success(self, runner, biotope_project):
    """Test successful commit."""
    with patch("biotope.commands.commit.find_biotope_root", return_value=biotope_project):
        # Create a change
        with open(biotope_project / ".biotope" / "datasets" / "new.jsonld", "w") as f:
            json.dump({"name": "new-dataset"}, f)
        result = runner.invoke(commit, ["-m", "Add new dataset"])
        assert result.exit_code == 0
        assert "Commit" in result.output
Git Command Testing¶
Tests use actual Git commands:
# Initialize Git repository
subprocess.run(["git", "init"], cwd=tmp_path, check=True)
subprocess.run(["git", "add", "."], cwd=tmp_path, check=True)
subprocess.run(["git", "commit", "-m", "Initial commit"], cwd=tmp_path, check=True)
Configuration¶
Biotope Configuration¶
.biotope/config/biotope.yaml:
version: "1.0"
croissant_schema_version: "1.0"
default_metadata_template: "scientific"
data_storage:
  type: "local"
  path: "data"
  checksum_algorithm: "sha256"
auto_stage: true
commit_message_template: "Update metadata: {description}"
Git Configuration¶
Standard Git configuration applies - no biotope-specific Git config.
Security Considerations¶
- No custom version control reduces attack surface
- Git's battle-tested security model applies
- Checksums provide data integrity verification
- Metadata validation prevents malformed commits
Performance¶
- Git operations are delegated to native Git binary
- No custom parsing or storage overhead
- Metadata files are small JSON-LD files
- Checksum calculation only on file changes
Future Enhancements¶
Planned improvements while maintaining Git-on-Top:
- Enhanced Croissant ML validation
- Metadata conflict resolution tools
- Integration with external metadata repositories
- Workflow automation features
Conclusion¶
The Git-on-Top implementation provides:
- Reliability: Battle-tested Git infrastructure
- Simplicity: No custom version control complexity
- Familiarity: Standard Git workflows and tools
- Maintainability: Minimal custom code to maintain
- Performance: Native Git performance
This approach eliminates the need for custom version control while providing robust metadata management capabilities.
Developer & Admin Guide: Annotation Validation¶
This document describes the internals and configuration of the annotation validation system in Biotope (git-on-top mode).
Configuration Structure¶
Annotation validation is configured in .biotope/config/biotope.yaml under the annotation_validation key:
annotation_validation:
  enabled: true
  minimum_required_fields:
    - name
    - description
    - creator
    - dateCreated
    - distribution
  field_validation:
    name:
      type: string
      min_length: 1
    description:
      type: string
      min_length: 10
    creator:
      type: object
      required_keys: [name]
    dateCreated:
      type: string
      format: date
    distribution:
      type: array
      min_length: 1
- enabled: Toggle validation on/off.
- minimum_required_fields: List of fields that must be present in each metadata file.
- field_validation: Per-field validation rules (type, min_length, required_keys, etc.).
Validation Logic¶
Validation is implemented in biotope/validation.py:
- is_metadata_annotated(metadata, config) checks if a metadata dict meets requirements.
- _validate_field(value, field_name, validation_rules) applies per-field rules.
- get_annotation_status_for_files(biotope_root, file_paths) returns annotation status for a list of files.
The system supports:
- Type checks (string, object, array)
- Minimum length for strings/arrays
- Required keys for objects
- ISO date format for date fields
Extending Validation¶
To add new validation rules:
- Update the field_validation structure in the config (via CLI or manually).
- Extend _validate_field in biotope/validation.py.