Research

Published

10/17/24

Current

I am a research software engineer at the Topos Institute, where I work with a team developing applied category theory (CT) software in Julia. The primary application is building tools that make scientific computing and knowledge representation easier, more transparent, and more robust to updating one’s model of the world. For an overview, see my summary talks “Scientific and Software Engineering Applications of CT” and “Combinatorial Representations of Scientific Knowledge”.

Writing code is more painful than it has to be, whether you’re a computational scientist or a software engineer. There are many high-level things we would like to do with code (e.g. combine the functionality of different codebases, or change one component or assumption without breaking everything else) that are not feasible in practice. Part of the reason is that the syntax of a general-purpose programming language is too powerful: one can do anything in it, which makes it hard to reason about! The abstractions we create to make our lives easier when programming (datatypes, classes, interfaces, scripts) are informed by decades of engineering know-how, yet they are also ad hoc and not understood mathematically at a deep level, which makes them very fragile.

Our general value proposition is that category theory can provide a foundation for computing by allowing us to work in syntaxes that are not arbitrary code. These syntaxes may be less expressive, but CT provides a formalism for giving them a computational semantics, along with tools for performing high-level operations in the simple syntax with guarantees that the corresponding operations happen in the computational semantics. We also understand these simpler syntaxes well (e.g. directed graphs) and can relate them to other syntaxes (e.g. Petri nets), giving us a principled way of extending or modifying our abstractions that was previously unavailable.
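To make the idea of relating one simple syntax to another concrete, here is a minimal Python sketch (not our Julia implementation). It assumes one particular, illustrative translation: each edge of a directed graph is read as a Petri net transition that moves a token from the edge's source to its target. The names `Graph`, `PetriNet`, `graph_to_petri`, and `fire` are hypothetical and chosen only for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Graph:
    vertices: tuple  # vertex names
    edges: tuple     # (source, target) pairs


@dataclass(frozen=True)
class PetriNet:
    places: tuple       # place names
    transitions: tuple  # (input places, output places) pairs


def graph_to_petri(g):
    """Assumed translation: edge s -> t becomes a transition from s to t."""
    return PetriNet(g.vertices, tuple(((s,), (t,)) for s, t in g.edges))


def fire(net, marking, index):
    """Fire one transition and return the new marking (place -> token count)."""
    inputs, outputs = net.transitions[index]
    needed = Counter(inputs)
    if any(marking.get(p, 0) < k for p, k in needed.items()):
        raise ValueError("transition is not enabled")
    new = dict(marking)
    for p in inputs:
        new[p] -= 1
    for p in outputs:
        new[p] = new.get(p, 0) + 1
    return new


# A one-edge graph A -> B, read as a net that moves tokens from A to B.
g = Graph(("A", "B"), (("A", "B"),))
net = graph_to_petri(g)
after = fire(net, {"A": 2, "B": 0}, 0)
```

The graph is the simple syntax, and the token-moving behavior of the resulting net is one possible computational semantics for it. Because the translation is defined edge by edge, any change made to the graph carries over to the net in a predictable way.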

Some recent projects in this vein:

Agent-based models: a compositional theory of environments combined with a compositional theory of events that can happen in that environment. See this 2-page summary document and code.
Rewriting: an implementation of a general theory of patterns and replacements, allowing one to declare very general types of knowledge in a pictorial, transparent form, rather than code. (2021)
Symmetries: As described in this blog post, this project concerns extending McKay’s graph algorithm (implemented as nauty) to C-sets, so that isomorphic instances can be immediately seen to be equal. [code]
Knowledge representation: see blog post on how a database can be equipped with equations (stated in a graphical form) and how these equations are enforced.
Model Exploration: a language for composing primitive “model spaces” (for scientific models of a very wide generality) into larger ones and a computational implementation, as described in our ACT 2022 paper (2022). [code]
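To illustrate the goal of the symmetries project (making isomorphic instances compare equal), here is a small Python sketch for plain directed graphs. It is deliberately a brute-force search over all vertex relabelings, not McKay's algorithm or nauty, and it does not handle general C-sets, so it is only practical for a handful of vertices. The function name `canonical_form` is hypothetical.

```python
from itertools import permutations


def canonical_form(n, edges):
    """Return the smallest sorted edge list over all relabelings of 0..n-1.

    Two directed graphs on n vertices are isomorphic exactly when their
    canonical forms are equal. Cost grows as n!, so this is a toy.
    """
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(sorted((perm[s], perm[t]) for s, t in edges))
        if best is None or relabeled < best:
            best = relabeled
    return best


# Two differently labeled directed paths on three vertices, plus a fork.
path_a = canonical_form(3, [(0, 1), (1, 2)])
path_b = canonical_form(3, [(2, 0), (0, 1)])
fork = canonical_form(3, [(0, 1), (0, 2)])
```

Here `path_a == path_b` because the two paths differ only in labeling, while `fork` differs from both. A real canonical labeling algorithm reaches the same kind of answer while pruning most of the search.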

Past research

Knowledge representation and data integration

Declarative specification of databases

Scientists modeling complicated phenomena often don’t use explicit (formally specified) models. This is pragmatic, given the open-source tools currently available, but informal reasoning ultimately leads to serious challenges in communicating scientific results clearly and sharing data. A relational database backend is important for a scalable modeling tool, but a SQL-less interface is also crucial: the complexity of managing database implementation details quickly becomes unmanageable and hard to extend as model complexity increases. I helped develop a Python EDSL that lets scientists generate relational databases from a natural declaration of scientific facts, and then naturally query and publicly communicate their knowledge base. Our strategy is published here (2021) and here (2023), and it is being commercialized by the startup Modelyst by Michael Statt and Brian Rohr.
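As a rough illustration of what a SQL-less, declarative interface can look like, the following Python sketch generates tables from a declaration of entities and their relationships. It is not the API of the EDSL described above. The `Entity` class, the helper functions, and the example table and column names are all hypothetical. It uses only the standard-library `sqlite3` module.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str                                 # table name
    attrs: dict                               # column name -> SQL type
    refs: dict = field(default_factory=dict)  # column name -> referenced entity


def create_table_sql(e):
    """Turn one declared entity into a CREATE TABLE statement."""
    cols = ["id INTEGER PRIMARY KEY"]
    cols += [f"{col} {typ}" for col, typ in e.attrs.items()]
    cols += [f"{col} INTEGER REFERENCES {target}(id)"
             for col, target in e.refs.items()]
    return f"CREATE TABLE {e.name} ({', '.join(cols)})"


def build_database(entities):
    """Create an in-memory SQLite database from the declarations."""
    conn = sqlite3.connect(":memory:")
    for e in entities:
        conn.execute(create_table_sql(e))
    return conn


# The scientist states the facts; the SQL is generated, not hand-written.
schema = [
    Entity("material", {"formula": "TEXT"}),
    Entity("calculation", {"energy": "REAL"}, {"material": "material"}),
]
conn = build_database(schema)
conn.execute("INSERT INTO material (formula) VALUES ('Cu')")
conn.execute("INSERT INTO calculation (energy, material) VALUES (-3.5, 1)")
rows = conn.execute(
    "SELECT formula, energy FROM calculation "
    "JOIN material ON calculation.material = material.id"
).fetchall()
```

The declaration in `schema` is the only part a user would edit when the model changes. Keys, foreign keys, and table creation follow from it mechanically.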

Heterogeneous data integration, using category theory

There is little standardization in how data is to be represented and stored in many scientific fields. However, the varying schemas of different researchers contain significant overlap in information, and for data-driven fields it is especially beneficial to be able to freely switch from one frame of reference to another.

Furthermore, when we update our view of the world, it’s important to be able to migrate our old data, algorithms, and analysis tools into the new framework. When these tools are expressed in the language of C-sets, this migration can be automated in a verifiable way, described in Categorical data integration for computational science (2019) with computational chemistry as a case study. As stressed in the paper, these migration tools are of importance to scientists who wish to communicate and share data with lower risk of data misinterpretation. This was implemented using Categorical Query Language, a tool developed by the startup Conexus.
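The simplest instance of this kind of migration can be sketched in a few lines of Python. The sketch assumes a schema is just a set of tables with single foreign-key columns between them, and that the mapping between schemas sends tables to tables and columns to columns (no composite paths and no equations). It is not Categorical Query Language. The function `pullback_migrate` and the example schema names are hypothetical.

```python
def pullback_migrate(ob_map, hom_map, sets, funcs):
    """Pull an instance on the target schema back along a schema mapping.

    ob_map:  source-schema table name  -> target-schema table name
    hom_map: source-schema column name -> target-schema column name
    sets:    target table name  -> set of row ids
    funcs:   target column name -> dict sending each row id to a row id
    """
    new_sets = {c: set(sets[d]) for c, d in ob_map.items()}
    new_funcs = {f: dict(funcs[g]) for f, g in hom_map.items()}
    return new_sets, new_funcs


# My data: each Calculation has an input Structure.
my_sets = {"Calculation": {1, 2}, "Structure": {10}}
my_funcs = {"input_structure": {1: 10, 2: 10}}

# A colleague's schema calls these Job, Geometry, and geom.
ob_map = {"Job": "Calculation", "Geometry": "Structure"}
hom_map = {"geom": "input_structure"}

their_sets, their_funcs = pullback_migrate(ob_map, hom_map, my_sets, my_funcs)
```

Because the migrated data is determined entirely by the two mappings, there is no hand-written conversion script to get out of sync. Checking the mapping once is enough to trust every migration performed with it.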

Development of functionals for Density Functional Theory

The simulation of chemical reactions using first-principles techniques requires a theoretical framework that is able to describe a wide range of electronic interactions. Under the direction of Johannes Voss and with Yasheng Maimaiti and Kai Trepte, I developed MCML (2021), a new meta exchange-correlation functional, using a semi-empirical approach that fits the functional form against higher-level theory and experimental benchmark data. By using Bayesian statistics, we enabled uncertainty estimation of the computed reaction energies. This complements my earlier research applying DFT to discover catalysts for sustainable energy applications ((2020), (2019), (2019), and (2018)) as well as my earlier experimental chemistry research ((2017) and (2017)).

Formal methods

I briefly rotated with the Barrett group at Stanford to learn about formal methods and made small contributions to Pono (2021) and SMT-Switch (2021).

I also interned at Google where we applied the Lean theorem prover. My final presentation was recorded here.

Talks

Title	Event	Links
Categorical approaches to inferentialist semantics	Topos 2024	video
ACT and Inferentialism	Methodology of Science & its Applications 2024	slides
Agent-Based Modeling via graph rewriting	GReTA 2024	video slides
Compositional modeling	Topos 2024	Python Julia
An appreciation of normativity	Topos 2023	slides
Practical abstract algebra for chemical engineers	UC Berkeley 2023	slides
A graphical language for rewriting-based programs	ACT 2023	slides
On extending mathematical attitudes to natural languages	Topos 2023	slides
Scientific and software engineering examples of applied category theory	Topos 2023	video slides
Compositional Exploration of Scientific Models	Glasgow, ACT 2022	video slides notes
Rewriting individual-based models for epidemiology	Glasgow, ACT 2022	video notes
Computational Category Theoretic Graph Rewriting	Nantes, ICGT 2022	slides
AlgebraicRewriting.jl - Declarative Data Transformation	JuliaCon 2022	video slides
Extending McKay’s Canonical Isomorph Algorithm to C-Sets	SIAM DM 2022	slides
Combinatorial Representations of Scientific Knowledge	Topos 2022	video slides
Leveraging Data to Improve the Accuracy of Chemistry Simulations	Thesis 2021	video
Formal Verification of Android Build Code	Google 2020	video

Posters

Title	Event
Applied Category Theory for Scientists	Denmark, Catalysis and Modeling Symposium 2022

CV

References

Brown, Kristopher S, Chiara Saggese, Benjamin P Le Monnier, Florent Héroguel, and Jeremy S Luterbacher. 2018. “Simulation of Gas- and Liquid-Phase Layer-by-Layer Deposition of Metal Oxides by Coarse-Grained Modeling.” The Journal of Physical Chemistry C 122 (12): 6713–20.

Brown, Kristopher S, David I Spivak, and Ryan Wisnesky. 2019. “Categorical Data Integration for Computational Science.” Computational Materials Science 164: 127–32. https://arxiv.org/pdf/1903.10579.pdf.

Brown, Kristopher, Tyler Hanks, and James Fairbanks. 2022. “Compositional Exploration of Combinatorial Scientific Models.” arXiv. https://doi.org/10.48550/ARXIV.2206.08755.

Brown, Kristopher, Yasheng Maimaiti, Kai Trepte, Thomas Bligaard, and Johannes Voss. 2021. “MCML: Combining Physical Constraints with Experimental Data for a Multi-Purpose Meta-Generalized Gradient Approximation.” Journal of Computational Chemistry 42 (28): 2004–13.

Brown, Kristopher, Evan Patterson, and James P. Fairbanks. 2021. “Computational Category-Theoretic Rewriting.” CoRR abs/2111.03784. https://arxiv.org/abs/2111.03784.

Chen, Dajing, Kaina Chen, Kristopher Brown, Annie Hang, and John XJ Zhang. 2017. “Liquid-Phase Tuning of Porous PVDF-TrFE Film on Flexible Substrate for Energy Harvesting.” Applied Physics Letters 110 (15): 153902.

Héroguel, Florent, Benjamin P Le Monnier, Kristopher S Brown, Juno C Siu, and Jeremy S Luterbacher. 2017. “Catalyst Stabilization by Stoichiometrically Limited Layer-by-Layer Overcoating in Liquid Media.” Applied Catalysis B: Environmental 218: 643–49.

Liu, Xinyan, Philomena Schlexer, Jianping Xiao, Yongfei Ji, Lei Wang, Robert B Sandberg, Michael Tang, et al. 2019. “pH Effects on the Electrochemical Reduction of CO(2) Towards C2 Products on Stepped Copper.” Nature Communications 10 (1): 1–10.

Ludwig, Thomas, Joseph A Gauthier, Kristopher S Brown, Stefan Ringe, Jens K Nørskov, and Karen Chan. 2019. “Solvent–Adsorbate Interactions and Adsorbate-Specific Solvent Structure in Carbon Dioxide Reduction on a Stepped Cu Surface.” The Journal of Physical Chemistry C 123 (10): 5999–6009.

Ludwig, Thomas, Joseph A Gauthier, Colin F Dickens, Kristopher S Brown, Stefan Ringe, Karen Chan, and Jens K Nørskov. 2020. “Atomistic Insight into Cation Effects on Binding Energies in Cu-Catalyzed Carbon Dioxide Reduction.” The Journal of Physical Chemistry C 124 (45): 24765–75.

Mann, Makai, Ahmed Irfan, Florian Lonsing, Yahan Yang, Hongce Zhang, Kristopher Brown, Aarti Gupta, and Clark Barrett. 2021. “Pono: A Flexible and Extensible SMT-Based Model Checker.” In International Conference on Computer Aided Verification, 461–74. Springer.

Mann, Makai, Amalee Wilson, Yoni Zohar, Lindsey Stuntz, Ahmed Irfan, Kristopher Brown, Caleb Donovick, Allison Guman, Cesare Tinelli, and Clark Barrett. 2021. “SMT-Switch: A Solver-Agnostic C++ API for SMT Solving.” In International Conference on Theory and Applications of Satisfiability Testing, 377–86. Springer.

Statt, Michael J, Brian A Rohr, Kris Brown, Dan Guevarra, Jens Hummelshøj, Linda Hung, Abraham Anapolsky, John M Gregoire, and Santosh K Suram. 2023. “ESAMP: Event-Sourced Architecture for Materials Provenance Management and Application to Accelerated Materials Discovery.” Digital Discovery 2 (4): 1078–88.

Statt, Michael, Kristopher Brown, Santosh Suram, Linda Hung, Daniel Schweigert, John Gregoire, and Brian Rohr. 2021. “DBgen: A Python Library for Defining Scalable, Maintainable, Accessible, Reconfigurable, Transparent (SMART) Data Pipelines.”