Metadata Tutorial
Sicifus lets you attach external metadata, such as experimental conditions or sequence properties, to your structural data.
Loading Metadata
Load metadata from a CSV file. The file must contain an id column whose values match the structure file names without the extension.
from sicifus import Sicifus
db = Sicifus(db_path="./my_db")
# Load a metadata CSV; its id column is matched to structure_id
db.load_metadata("path/to/summaries.csv")
# Loaded metadata 'summaries': 994 rows, 18 columns
# 994/994 rows match structures in the database
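If fewer rows match than expected, it can help to check the ids yourself before loading. The following is a minimal sketch that does not use Sicifus at all: it compares the id column of a CSV against file names with the final extension removed, using only the standard library. The helper name match_ids, the file names, and the CSV contents are hypothetical, and the assumption that only the last extension is stripped may not reflect how Sicifus derives structure_id.

```python
import csv
import io
from pathlib import Path


def match_ids(csv_text, structure_files, id_column="id"):
    """Split the CSV ids into those with and without a matching structure file."""
    # Path(...).stem drops the directory and the final extension:
    # "structures/prot_a.pdb" -> "prot_a"
    stems = {Path(name).stem for name in structure_files}
    reader = csv.DictReader(io.StringIO(csv_text))
    ids = [row[id_column] for row in reader]
    matched = [i for i in ids if i in stems]
    unmatched = [i for i in ids if i not in stems]
    return matched, unmatched


# Hypothetical data for illustration only.
example_csv = "id,protein_length\nprot_a,120\nprot_b,95\nprot_c,300\n"
example_files = ["structures/prot_a.pdb", "structures/prot_b.cif"]

matched, unmatched = match_ids(example_csv, example_files)
print(f"{len(matched)}/{len(matched) + len(unmatched)} rows match structures")
print("unmatched ids:", unmatched)
```

Note that a doubly-suffixed name such as x.pdb.gz has the stem x.pdb under this rule, so an id of x would be reported as unmatched.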
Exploring Metadata
Once loaded, the metadata is available as a Polars LazyFrame via db.meta, and db.meta_columns() lists the available column names.
# See what columns are available
print(db.meta_columns())
# ['clash_score', 'radius_of_gyration', 'protein_length', ...]
# Quick histogram of any column
db.hist("radius_of_gyration")
db.hist("protein_length", bins=50)
# Scatter plot
db.scatter("protein_length", "radius_of_gyration")
Combining with Clustering
You can color-code plots by cluster assignment to see whether structural clusters correlate with metadata properties.
# Color by cluster (after annotate_clusters)
db.hist("radius_of_gyration", color_by="cluster")
db.scatter("protein_length", "radius_of_gyration", color_by="cluster")
Custom Queries with Polars
For custom queries, you can access the raw metadata as a Polars LazyFrame through db.meta.