Julia Scientific Package Mapping Skill
"Two languages diverged in a scientific wood, and Julia—Julia took the one with multiple dispatch."
bmorphism Contributions
"We are building cognitive infrastructure for the next trillion minds" — Plurigrid: the story thus far
"complexity of information / the burden of integrating it in real time makes technology an indispensable part of our cognitive infrastructure" — @bmorphism
Key References from Plurigrid:
- Towards Foundations of Categorical Cybernetics
- Organizing Physics with Open Energy-Driven Systems
- Compositional game theory
Overview
This skill maps 137 K-Dense-AI Python scientific skills to their Julia package equivalents. About 85% are covered by native Julia packages; the remainder is accessible via PyCall.jl interop.
Quick Reference
| Category | Skills | Coverage | Key Packages |
|----------|--------|----------|--------------|
| Bioinformatics | 25 | 92% | BioJulia ecosystem |
| Chemistry | 17 | 85% | JuliaMolSim, Chemellia |
| Quantum | 4 | 100% | Yao.jl, QuantumToolbox.jl |
| ML/AI | 10 | 95% | Flux.jl, MLJ.jl, Lux.jl |
| Data/Stats | 11 | 100% | DataFrames.jl, Turing.jl |
| Visualization | 6 | 100% | Makie.jl, Plots.jl |
| Physics/Astro | 6 | 90% | JuliaAstro ecosystem |
| Clinical/DB | 13 | 60% | JuliaHealth, HTTP.jl |
| Symbolic/Geo | 3 | 100% | Symbolics.jl, GeoDataFrames.jl |
| Lab Automation | 8 | 50% | DrWatson.jl, Dagger.jl |
| Documents | 5 | 80% | PDFIO.jl, Weave.jl |
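As a quick arithmetic check on the table, each category's coverage can be weighted by its skill count to get an overall figure. The sketch below uses only the numbers in the rows above; the variable names are illustrative.

```julia
# Rows copied from the Quick Reference table: (category, skills, coverage in percent).
rows = [
    ("Bioinformatics", 25, 92),
    ("Chemistry", 17, 85),
    ("Quantum", 4, 100),
    ("ML/AI", 10, 95),
    ("Data/Stats", 11, 100),
    ("Visualization", 6, 100),
    ("Physics/Astro", 6, 90),
    ("Clinical/DB", 13, 60),
    ("Symbolic/Geo", 3, 100),
    ("Lab Automation", 8, 50),
    ("Documents", 5, 80),
]

total_skills = sum(r[2] for r in rows)

# Weight each category's coverage by its number of skills.
weighted_coverage = sum(r[2] * r[3] for r in rows) / total_skills

println("skills listed: ", total_skills)
println("weighted coverage: ", round(weighted_coverage, digits=1), "%")
```

With these rows the script reports 108 skills and a weighted coverage of about 85.3%, in line with the ~85% quoted in the Overview.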
GF(3) Conservation
Julia scientific triads maintain balance:
bioinformatics (-1) ⊗ visualization (0) ⊗ quantum (+1) = 0 ✓
chemistry (-1) ⊗ data-science (0) ⊗ ml-ai (+1) = 0 ✓
physics (-1) ⊗ symbolic (0) ⊗ clinical (+1) = 0 ✓
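The balance condition can be checked mechanically. This is a minimal sketch that assumes ⊗ is read as adding the trits and reducing modulo 3; `is_balanced` is a hypothetical helper name, not part of any package mentioned here.

```julia
# Trit assignments copied from the triads above.
triads = [
    ["bioinformatics" => -1, "visualization" => 0, "quantum" => +1],
    ["chemistry" => -1, "data-science" => 0, "ml-ai" => +1],
    ["physics" => -1, "symbolic" => 0, "clinical" => +1],
]

# Assumption: a triad is balanced when its trits sum to 0 modulo 3.
is_balanced(triad) = mod(sum(last, triad), 3) == 0

for triad in triads
    names = join(first.(triad), " ⊗ ")
    println(names, " balanced: ", is_balanced(triad))
end
```

A triad such as (+1, +1, 0) sums to 2 modulo 3 and would be reported as unbalanced.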
Core Mappings
Bioinformatics (BioJulia)
| Python | Julia | Performance |
|--------|-------|-------------|
| biopython | BioSequences.jl + FASTX.jl | 2-5x faster |
| scanpy | SingleCellProjections.jl | 10x faster |
| anndata | Muon.jl | Native H5AD |
| cobrapy | COBREXA.jl | GPU support |
| pysam | XAM.jl | 3x faster BAM |
Chemistry (JuliaMolSim + Chemellia)
| Python | Julia | Notes |
|--------|-------|-------|
| rdkit | MolecularGraph.jl | Pure Julia SMILES |
| deepchem | AtomicGraphNets.jl | GNN molecular ML |
| pymatgen | DFTK.jl + AtomsBase.jl | DFT calculations |
| pyopenms | mzML.jl | Mass spec data |
Quantum (QuantumBFS)
| Python | Julia | Advantage |
|--------|-------|-----------|
| qiskit | Yao.jl | Native differentiable |
| cirq | Yao.jl + QuantumClifford.jl | Faster simulation |
| pennylane | JuliVQC.jl | 2-5x faster VQC |
| qutip | QuantumToolbox.jl | GPU + autodiff |
ML/AI (FluxML + MLJ)
| Python | Julia | Notes |
|--------|-------|-------|
| pytorch-lightning | FluxTraining.jl + Lux.jl | Explicit params |
| transformers | Transformers.jl | Pretrained loading |
| stable-baselines3 | ReinforcementLearning.jl | Modular RL |
| shap | ShapML.jl + ExplainableAI.jl | Native Shapley |
| torch_geometric | GraphNeuralNetworks.jl | PyG-inspired |
Data Science (JuliaData + JuliaStats)
| Python | Julia | Performance |
|--------|-------|-------------|
| polars | DataFrames.jl | Comparable |
| dask | Dagger.jl | DAG scheduler |
| pymc | Turing.jl | Often faster |
| statsmodels | GLM.jl + MixedModels.jl | Native |
| networkx | Graphs.jl | Much faster |
Visualization (Makie + Plots)
| Python | Julia | Notes |
|--------|-------|-------|
| matplotlib | Plots.jl + CairoMakie.jl | Multi-backend |
| plotly | PlotlyJS.jl + WGLMakie.jl | Interactive |
| seaborn | AlgebraOfGraphics.jl | Grammar-of-graphics |
Document Processing (Papers/OCR)
| Python | Julia | Use |
|--------|-------|-----|
| pdfminer | PDFIO.jl | Native PDF parsing |
| pytesseract | Tesseract.jl | OCR wrapper |
| markdown | Weave.jl + Literate.jl | Literate programming |
| latex | TikzPictures.jl + PGFPlotsX.jl | Publication quality |
Mathpix Integration
using HTTP, JSON3
using Base64  # stdlib; provides base64encode

# Send a PNG image to the Mathpix OCR endpoint and return the parsed JSON reply.
function mathpix_ocr(image_path; app_id, app_key)
    headers = ["app_id" => app_id, "app_key" => app_key,
               "Content-type" => "application/json"]
    body = JSON3.write(Dict(
        "src" => "data:image/png;base64," * base64encode(read(image_path)),
        "formats" => ["latex_styled", "text"]
    ))
    resp = HTTP.post("https://api.mathpix.com/v3/text", headers, body)
    JSON3.read(resp.body)
end
Key Julia Organizations
| Org | Focus | Packages |
|-----|-------|----------|
| BioJulia | Bioinformatics | 90+ packages |
| JuliaMolSim | Molecular simulation | Molly, DFTK, AtomsBase |
| Chemellia | Chemistry ML | AtomicGraphNets, ChemistryFeaturization |
| QuantumBFS | Quantum computing | Yao, YaoBlocks |
| FluxML | Deep learning | Flux, Zygote, FluxTraining |
| JuliaStats | Statistics | GLM, Distributions, Turing |
| JuliaAstro | Astronomy | AstroLib, FITSIO, SkyCoords |
| JuliaHealth | Medical/clinical | BioMedQuery, OMOP |
| JuliaGeo | Geospatial | GeoDataFrames, ArchGDAL |
| SciML | Scientific ML | DifferentialEquations, ModelingToolkit |
Usage Examples
Single-Cell Analysis (scanpy → SingleCellProjections.jl)
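using SingleCellProjections, Muon
# Load AnnData (readh5ad is provided by Muon.jl)
adata = readh5ad("pbmc3k.h5ad")
# Process (10x faster than scanpy)
# NOTE: the step names below mirror the scanpy workflow and should be read as
# pseudocode. SingleCellProjections.jl has its own pipeline functions, so check
# its documentation for the actual calls before running this.
adata = normalize_total(adata)
adata = log1p(adata)
adata = highly_variable_genes(adata)
adata = pca(adata)
adata = umap(adata)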
Quantum Circuit (qiskit → Yao.jl)
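using Yao
# Bell state
circuit = chain(2, put(1=>H), control(1, 2=>X))
# Measure
result = measure(zero_state(2) |> circuit, nshots=1000)
# Differentiable! expect' returns the gradients with respect to the input
# register and the circuit parameters. This Bell circuit has no parameters,
# so the parameter gradient is empty until rotation gates are added.
grad = expect'(kron(Z, Z), zero_state(2) => circuit)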
Molecular GNN (deepchem → Chemellia)
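using MolecularGraph, AtomicGraphNets, ChemistryFeaturization
# Featurize molecules (smilestomol comes from MolecularGraph.jl)
mol = smilestomol("CCO") # ethanol
fg = featurize(mol, GraphNodeFeaturization())
# Train GNN
# NOTE: this is a workflow sketch. CGCGNModel and train! are illustrative
# names, and `molecules` and `targets` stand for your own dataset; check the
# AtomicGraphNets.jl documentation for the constructors it actually exports.
model = CGCGNModel(fg, target_prop=:logP)
train!(model, molecules, targets)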
Bayesian Inference (pymc → Turing.jl)
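using Turing
# x and y are the observed predictor and response vectors (same length).
@model function linear_regression(x, y)
    α ~ Normal(0, 10)
    β ~ Normal(0, 10)
    σ ~ truncated(Normal(0, 1), 0, Inf)
    for i in eachindex(y)
        y[i] ~ Normal(α + β * x[i], σ)
    end
end
# Draw 1000 posterior samples with the No-U-Turn sampler
chain = sample(linear_regression(x, y), NUTS(), 1000)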
Full Mapping Document
The Homoiconic Bridge: Scheme ↔ SMILES ↔ ACSet
Deep structural insight: S-expressions (Scheme), SMILES strings (chemistry), and ACSets share a common foundation — trees/graphs with recursive self-reference.
Scheme S-expr: (+ (* 2 3) (- 4 1)) → AST tree
SMILES: CC(=O)Oc1ccccc1C(=O)O → Molecular graph
ACSet: Graph{V,E,src,tgt} → Typed graph functor
All three: linearized representations of graph structure
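To make the "linearized representation of a graph" idea concrete, here is a minimal sketch in plain Julia that parses the S-expression above into a tree and then flattens it into vertex labels plus `src`/`tgt` edge arrays. `parse_sexpr` and `tree_to_graph` are hypothetical helpers written for this illustration; they are not part of Catlab, LispSyntax, or any other package named here, and plain vectors stand in for a real ACSet. The sketch assumes every list is non-empty and starts with an atom.

```julia
# Parse an S-expression string into nested Julia vectors (a tree).
function parse_sexpr(src::AbstractString)
    tokens = split(replace(replace(src, "(" => " ( "), ")" => " ) "))
    pos = Ref(1)
    function read_node()
        tok = tokens[pos[]]
        pos[] += 1
        if tok == "("
            children = Any[]
            while tokens[pos[]] != ")"
                push!(children, read_node())
            end
            pos[] += 1                      # consume the closing ")"
            return children
        end
        return String(tok)                  # an atom
    end
    return read_node()
end

# Flatten the tree into vertex labels and parallel src/tgt edge arrays,
# the same shape of data that a Graph{V,E,src,tgt} schema stores.
function tree_to_graph(tree)
    labels = String[]
    src = Int[]
    tgt = Int[]
    function visit(node, parent)
        is_list = node isa Vector
        # A list becomes a vertex labeled by its head; an atom labels itself.
        push!(labels, is_list ? String(first(node)) : node)
        id = length(labels)
        if parent != 0
            push!(src, parent)
            push!(tgt, id)
        end
        if is_list
            for child in node[2:end]
                visit(child, id)
            end
        end
        return id
    end
    visit(tree, 0)
    return (labels = labels, src = src, tgt = tgt)
end

g = tree_to_graph(parse_sexpr("(+ (* 2 3) (- 4 1))"))
println(length(g.labels), " vertices, ", length(g.src), " edges")  # 7 vertices, 6 edges
```

A SMILES parser would produce the same kind of output (atoms as vertices, bonds as edges), except that ring closures add edges that make the result a graph rather than a tree.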
What Comes After SMILES: Learnable Chemical Structure
The evolution of molecular representation — 7 parallel streams colored via Gay.jl (seed=137):
| Gen | Color | Representation | Julia Package | Properties |
|-----|-------|----------------|---------------|------------|
| 1 | #43D9E1 | SMILES string | MolecularGraph.jl | Canonical, not learnable |
| 2 | #18CDEF | SELFIES | PyCall+selfies | Robust, generative-friendly |
| 3 | #18D6D0 | Fingerprints | MolecularGraph.jl | Fixed-dim vectors |
| 4 | #C70D22 | Graph features | ChemistryFeaturization.jl | Handcrafted node/edge |
| 5 | #E44ABB | GNN (MPNN/GAT/SchNet) | GraphNeuralNetworks.jl | Fully learnable |
| 6 | #58A021 | 3D coordinates | Chemfiles.jl, DFTK.jl | Geometry-aware |
| 7 | #BDB223 | Foundation models | Coming | Pre-trained, transferable |
Parallel Evolution Insight: Each generation evolves along its own deterministic color stream. Workers 1-3 explore the space in parallel (Strong Parallelism Invariance: same seeds = same colors).
Stream 1 (SMILES): #43D9E1 → #B78225 → #D54E82 (canonical → extended → stereochem)
Stream 2 (SELFIES): #18CDEF → #6CBA3C → #EC9426 (robust → constrained → grammar)
Stream 5 (GNN): #E44ABB → #50CD2E → #942B89 (MPNN → GAT → SchNet/DimeNet)
Stream 7 (Foundation): #BDB223 → #88ECA7 → #5CDA99 (pretrain → finetune → adapt)
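The invariance property (same seed, same colors, regardless of which worker computes them) follows whenever each color is a pure function of the seed, the stream number, and the position in the stream. The sketch below illustrates that idea with a SplitMix64-style mixing function. It is a stand-in written for this explanation, not Gay.jl's algorithm, so it will not reproduce the hex values listed above; `mix64`, `stream_color`, and `color_stream` are hypothetical names.

```julia
# SplitMix64-style mixing step: a pure function of its 64-bit input.
# UInt64 arithmetic wraps around on overflow, which is what we want here.
function mix64(z::UInt64)
    z += 0x9e3779b97f4a7c15
    z = (z ⊻ (z >> 30)) * 0xbf58476d1ce4e5b9
    z = (z ⊻ (z >> 27)) * 0x94d049bb133111eb
    return z ⊻ (z >> 31)
end

# The color at position `index` of stream `stream` depends only on
# (seed, stream, index), never on evaluation order or shared state.
function stream_color(seed::Integer, stream::Integer, index::Integer)
    h = mix64(mix64(mix64(UInt64(seed)) ⊻ UInt64(stream)) ⊻ UInt64(index))
    rgb = h & 0xffffff                       # keep the low 24 bits as RGB
    return "#" * uppercase(string(rgb, base = 16, pad = 6))
end

color_stream(seed, stream; count = 3) =
    [stream_color(seed, stream, i) for i in 1:count]

# Any worker can compute any stream, in any order, and get the same result.
for s in (1, 2, 5, 7)
    println("Stream ", s, ": ", join(color_stream(137, s), " → "))
end
```

Because no random-number-generator state is shared between calls, splitting the streams across workers cannot change the output.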
# The homoiconic bridge in code
# NOTE: this is a conceptual sketch. ast_to_acset, mol_to_acset, gnn_embed and
# the Specter-style select/ALL/pred helpers are not exported by the packages
# loaded here; treat them as placeholders for your own bridge functions.
using LispSyntax, MolecularGraph, Catlab, AtomicGraphNets
# Scheme code → AST → ACSet
sexp = @lisp (defun f (x) (+ x 1))
ast_acset = ast_to_acset(sexp)
# SMILES → Molecular graph → ACSet → GNN embedding
mol = smilestomol("c1ccccc1") # benzene
mol_acset = mol_to_acset(mol)
embedding = gnn_embed(mol_acset) # 64-dim learned vector
# Both navigate identically via Specter patterns!
branches_in_ast = select([ALL, pred(is_call_node)], ast_acset)
rings_in_mol = select([ALL, pred(is_ring_atom)], mol_acset)
# The deep insight: code and molecules are both graphs
# → same tools (ACSets, GNNs) work for both
Coloring Parallel Evolution with Gay.jl
using Gay
# 7 generations evolving in parallel streams (seed=137)
struct MolRepGeneration
    name::String
    color::String
    learnable::Bool
    evolution::Vector{String} # Color stream for sub-generations
end
function color_mol_evolution(seed=137)
    streams = Gay.interleave(seed, n_streams=7, count=3)
    generations = [
        MolRepGeneration("SMILES", streams[1][1], false, streams[1]),
        MolRepGeneration("SELFIES", streams[2][1], false, streams[2]),
        MolRepGeneration("Fingerprints", streams[3][1], false, streams[3]),
        MolRepGeneration("GraphFeatures", streams[4][1], false, streams[4]),
        MolRepGeneration("GNN", streams[5][1], true, streams[5]), # Learnable!
        MolRepGeneration("3DCoords", streams[6][1], true, streams[6]),
        MolRepGeneration("Foundation", streams[7][1], true, streams[7])
    ]
    # GF(3) balance: non-learnable (-1) + transition (0) + learnable (+1) = 0
    return generations
end
# Visualize evolution paths
for gen in color_mol_evolution()
    trit = gen.learnable ? "+1" : "-1"
    println("$(gen.name) [$(trit)]: $(join(gen.evolution, " → "))")
end
Key Julia Packages for Learnable Chemistry
| Package | Role | From Python | GNN Arch |
|---------|------|-------------|----------|
| MolecularGraph.jl | SMILES parsing, fingerprints | rdkit | - |
| ChemistryFeaturization.jl | Node/edge featurization | deepchem | - |
| GraphNeuralNetworks.jl | MPNN, GCN, GAT, GraphSAGE | torch_geometric, dgl | ✓ |
| GeometricFlux.jl | Geometric deep learning | PyG | ✓ |
| Flux.jl | Training infrastructure | pytorch | - |
| Chemfiles.jl | 3D structure I/O | MDAnalysis | - |
| DFTK.jl | Electronic structure (DFT) | pymatgen | - |
| NNlib.jl | Neural network primitives | torch.nn | - |
GNN Architecture Evolution:
MPNN (2017) → GCN → GAT (attention) → SchNet (3D) → DimeNet → Equivariant GNNs
- MPNN: message passing
- GAT: graph attention (multi-head)
- SchNet and later: geometry-aware (E(3) invariant)
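The message-passing step at the start of this progression can be sketched in a few lines of plain Julia. This is a deliberately simplified illustration: a real MPNN uses learned message and update functions, whereas here both are fixed to plain summation, and `message_pass` is a hypothetical name, not a function from GraphNeuralNetworks.jl.

```julia
# One round of sum-aggregation message passing on an undirected graph.
# `features[v]` is the feature vector of node v; `edges` lists undirected bonds.
function message_pass(features::Vector{Vector{Float64}},
                      edges::Vector{Tuple{Int,Int}})
    updated = [copy(f) for f in features]    # each node keeps its own features
    for (u, v) in edges
        updated[u] .+= features[v]           # message v → u
        updated[v] .+= features[u]           # message u → v
    end
    return updated
end

# Ethanol heavy atoms C-C-O with one-hot [is_carbon, is_oxygen] features.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
edges = [(1, 2), (2, 3)]

h1 = message_pass(features, edges)
println(h1)  # [[2.0, 0.0], [2.0, 1.0], [1.0, 1.0]]
```

After one round the two carbons, which started with identical features, are distinguishable because only the second one neighbors the oxygen. Stacking rounds widens each atom's view of the molecule; attention-based and geometry-aware models refine how the messages are weighted and what they carry.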
Related Skills
- acsets - Algebraic databases with Gay.jl coloring
- gay-julia / julia-gay - Deterministic color generation
- specter-acset - Bidirectional navigation
- structured-decomp - Sheaf-based decompositions
- condensed-analytic-stacks - Scholze-Clausen mathematics
- lispsyntax-acset - S-expression ↔ ACSet bridge
Commands
# List installed packages whose names match a pattern (Pkg.status shows package names, not organization names)
julia -e 'using Pkg; Pkg.status()' | grep -i bio
# Install BioJulia stack
julia -e 'using Pkg; Pkg.add(["BioSequences", "FASTX", "XAM", "BioStructures"])'
# Install ML stack
julia -e 'using Pkg; Pkg.add(["Flux", "MLJ", "GraphNeuralNetworks"])'
# Install quantum stack
julia -e 'using Pkg; Pkg.add(["Yao", "QuantumToolbox"])'
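Before installing a stack, it can help to see which of its packages the active environment already has. The following is a minimal sketch using the `Pkg` standard library; `direct_dependencies` and `missing_packages` are hypothetical helper names, and the package list is the BioJulia stack from the command above.

```julia
using Pkg

# Names of the packages added directly to the active environment.
function direct_dependencies()
    return Set(info.name for info in values(Pkg.dependencies()) if info.is_direct_dep)
end

# Pure helper: which of the wanted packages are absent from `installed`?
missing_packages(wanted, installed) = [name for name in wanted if !(name in installed)]

bio_stack = ["BioSequences", "FASTX", "XAM", "BioStructures"]
println("missing: ", missing_packages(bio_stack, direct_dependencies()))
```

The resulting vector can be passed straight to `Pkg.add` to install only what is absent.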
GF(3) Skill Triads
julia-scientific (0) ⊗ gay-mcp (+1) ⊗ acsets (-1) = 0 ✓
julia-scientific (0) ⊗ specter-acset (+1) ⊗ structured-decomp (-1) = 0 ✓
Generated from exhaustive parallel search of Julia package ecosystem (2025-12-30)