geno_lewm.carbon_zero_shot
Carbon zero-shot baseline scoring artifacts for release evaluation.
CarbonZeroShotRecord (dataclass)
CarbonZeroShotRecord(chrom: str, pos: int, ref: str, alt: str, carbon_ref_log_likelihood: float, carbon_alt_log_likelihood: float, carbon_alt_minus_ref_log_likelihood: float, carbon_zero_shot_score: float, window_start_bp: int, window_bp: int, reference_window_sha256: str, alternate_window_sha256: str)
One Carbon zero-shot score row for geno-lewm-eval baseline input.
to_json_dict
Return the JSONL row consumed by geno-lewm-eval.
Source code in geno_lewm/carbon_zero_shot.py
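As an illustration of the row shape, the sketch below mirrors the documented field list in a stand-in dataclass and serializes it as one JSONL line. `ZeroShotRow` is a hypothetical name, the field values are placeholders, and the flat field-name-to-value mapping produced by `dataclasses.asdict` is an assumption; the real `to_json_dict` may rename, round, or drop fields.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ZeroShotRow:
    """Hypothetical stand-in with the same fields as CarbonZeroShotRecord."""

    chrom: str
    pos: int
    ref: str
    alt: str
    carbon_ref_log_likelihood: float
    carbon_alt_log_likelihood: float
    carbon_alt_minus_ref_log_likelihood: float
    carbon_zero_shot_score: float
    window_start_bp: int
    window_bp: int
    reference_window_sha256: str
    alternate_window_sha256: str

    def to_json_dict(self) -> dict:
        # Assumption: the row is a flat mapping of field name to value.
        return asdict(self)


# Placeholder values only; they are not real scores or digests.
row = ZeroShotRow(
    chrom="chr1",
    pos=12345,
    ref="A",
    alt="G",
    carbon_ref_log_likelihood=-10.5,
    carbon_alt_log_likelihood=-11.0,
    carbon_alt_minus_ref_log_likelihood=-0.5,
    carbon_zero_shot_score=-0.5,
    window_start_bp=12341,
    window_bp=8,
    reference_window_sha256="0" * 64,
    alternate_window_sha256="f" * 64,
)

# One JSON object per line is what makes the file JSONL.
jsonl_line = json.dumps(row.to_json_dict(), sort_keys=True)
```

Every field in the documented signature is a string, integer, or float, so a row like this should serialize with the standard `json` module without a custom encoder.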
CarbonZeroShotSummary (dataclass)
CarbonZeroShotSummary(generated_by: str, generated_at: str, carbon_model: str, carbon_revision: str, vcf: str, fasta: str, output_scores: str, score_field: str, records: int, window_bp: int, logp_cache: str | None, logp_cache_entries: int, new_logp_evaluations: int, local_files_only: bool)
Machine-readable summary for a generated baseline score artifact.
to_json_dict
Return the JSON-native summary payload.
CarbonLogLikelihoodScorer
CarbonLogLikelihoodScorer(model: object, tokenizer: object, *, torch: object, device: str | None = None)
Compute autoregressive Carbon log-likelihood for one DNA window.
__call__
Return summed next-token log-likelihood for a Carbon DNA window.
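To make "summed next-token log-likelihood" concrete, here is a minimal pure-Python sketch of the arithmetic. It is not the library's implementation, which runs a Transformers model through `torch`. The helper names `log_softmax` and `summed_next_token_logp` are hypothetical, and the one-token-per-nucleotide vocabulary in the toy check is an assumption about tokenization made only for illustration.

```python
import math
from typing import Sequence


def log_softmax(logits: Sequence[float]) -> list[float]:
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def summed_next_token_logp(
    logits: Sequence[Sequence[float]], token_ids: Sequence[int]
) -> float:
    """Sum log p(token[t + 1] | tokens[: t + 1]) over the sequence.

    logits[t] is the model output after reading token_ids[: t + 1], so it
    scores token_ids[t + 1]. The first token has no prefix and is not scored.
    """
    if len(logits) != len(token_ids):
        raise ValueError("expected one logit vector per input token")
    total = 0.0
    for t in range(len(token_ids) - 1):
        total += log_softmax(logits[t])[token_ids[t + 1]]
    return total


# Toy check: uniform logits over a 4-symbol vocabulary (A, C, G, T).
# Five tokens give four scored positions, each contributing log(1/4).
uniform_logits = [[0.0, 0.0, 0.0, 0.0]] * 5
toy_logp = summed_next_token_logp(uniform_logits, [0, 1, 2, 3, 0])
```

Because the result is a sum rather than a mean, it grows in magnitude with the number of tokens, so two windows are only directly comparable when they tokenize to similar lengths.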
load_carbon_logp_scorer
load_carbon_logp_scorer(model_dir: str | Path, *, revision: str = 'main', dtype: str = 'bf16', device: str | None = None, trust_remote_code: bool = False, local_files_only: bool = True) -> CarbonLogLikelihoodScorer
Load a local Carbon language-model scorer through Transformers.
write_carbon_zero_shot_scores
write_carbon_zero_shot_scores(*, vcf_path: str | Path, fasta_path: str | Path, output_scores: str | Path, scorer: Callable[[str], float], carbon_model: str, carbon_revision: str, window_bp: int = DEFAULT_WINDOW_BP, logp_cache_jsonl: str | Path | None = None, metadata_output: str | Path | None = None, generated_at: str | None = None, local_files_only: bool = True) -> CarbonZeroShotSummary
Write Carbon zero-shot baseline scores for all VCF alternate alleles.
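The per-allele computation implied by the record fields can be sketched as follows: build a reference window, splice in the alternate allele, score both with any `Callable[[str], float]`, and take the difference. This is a hedged illustration, not the function's actual logic. `score_variant` is a hypothetical helper, and centering the window on the variant, clipping at the contig edges, and reporting a 0-based window start are all assumptions. How `carbon_zero_shot_score` is derived from the difference is not stated here, so the sketch leaves it out.

```python
import hashlib
from typing import Callable


def sha256_hex(sequence: str) -> str:
    """Hex SHA-256 digest of a sequence string."""
    return hashlib.sha256(sequence.encode("ascii")).hexdigest()


def score_variant(
    reference: str,
    pos: int,
    ref: str,
    alt: str,
    scorer: Callable[[str], float],
    window_bp: int,
) -> dict:
    """Score one alternate allele against its reference window.

    pos is 1-based, as in VCF; reference is the full contig sequence.
    """
    idx = pos - 1
    if reference[idx : idx + len(ref)].upper() != ref.upper():
        raise ValueError("REF allele does not match the reference sequence")
    # Assumption: window centred on the variant, clipped at the contig edges.
    start = max(0, idx - window_bp // 2)
    end = min(len(reference), start + window_bp)
    ref_window = reference[start:end]
    # Splice the alternate allele in place of the reference allele.
    offset = idx - start
    alt_window = ref_window[:offset] + alt + ref_window[offset + len(ref) :]
    ref_logp = scorer(ref_window)
    alt_logp = scorer(alt_window)
    return {
        "carbon_ref_log_likelihood": ref_logp,
        "carbon_alt_log_likelihood": alt_logp,
        "carbon_alt_minus_ref_log_likelihood": alt_logp - ref_logp,
        "window_start_bp": start,  # 0-based here; the real convention may differ
        "window_bp": window_bp,
        "reference_window_sha256": sha256_hex(ref_window),
        "alternate_window_sha256": sha256_hex(alt_window),
    }


def toy_scorer(window: str) -> float:
    """Stand-in scorer that penalises each G; not a language model."""
    return -float(window.count("G"))


result = score_variant("ACGTACGTACGT", 6, "C", "G", toy_scorer, window_bp=8)
```

Hashing both windows gives each score a reproducible fingerprint of the exact sequence that was evaluated, which is presumably also what lets a log-likelihood cache recognize previously scored windows.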