Posts by Tags

antibody design

Developments in machine learning for antibody design

23 minute read

Published:

Protein structure and sequence modeling has seen a resurgence in the last couple of years owing to developments in machine learning (ML) and deep learning (DL) based techniques. These techniques come in a variety of flavours, including equivariant neural network modules that respect the structural properties of 3D macromolecules, deeper networks that benefit from the growing number of available experimental structures, powerful node-to-node relationship learners like transformers, and masked language modeling on the protein sequence space to learn evolutionary information. While structure prediction methods like AlphaFold (AF) [1] and RoseTTAFold (RF) [2] have become ubiquitous in computational structural biology, challenges remain on multiple fronts, where ML will play an important role.

artificial intelligence

Enhancing Factual Accuracy in Large Language Models: Integrating Decoding Strategies and Model Steering

19 minute read

Published:

The emergence of open-source Large Language Models (LLMs) like Llama has revolutionized natural language generation (NLG), making advanced conversational AI accessible to a broader audience [1]. Despite their impressive capabilities, these models often grapple with a significant challenge: factual hallucinations. Factual hallucinations occur when an AI model generates content that is unfaithful to the source material or cannot be verified against reliable data [2]. This issue is particularly concerning in critical and information-dense fields such as health, law, finance, and education, where misinformation can have catastrophic consequences [3][4].

Perspectives on the future of AI

13 minute read

Published:

How big are the models going to get and how much longer is the scaling hypothesis going to hold? It’s unclear, but current performance trends, which haven’t shown signs of plateauing (GPT-4o, Claude 3.5 Sonnet, Gemini-1.5-Pro, Llama-3.1-405B, Grok-2), and the power budget of announced data centres (5 GW OpenAI/Microsoft Stargate campus) suggest that there is likely an order of magnitude (OOM) left to climb in model size. This Epoch AI research covers these scenarios in depth and estimates that training runs on the order of 2e29 FLOPs will be possible by 2030, which would be 4 OOMs larger than GPT-4 (2e25 FLOPs). These training runs will be constrained primarily by power, followed by chips, data, and latency.

Developments in machine learning for antibody design

23 minute read

Published:

Protein structure and sequence modeling has seen a resurgence in the last couple of years owing to developments in machine learning (ML) and deep learning (DL) based techniques. These techniques come in a variety of flavours, including equivariant neural network modules that respect the structural properties of 3D macromolecules, deeper networks that benefit from the growing number of available experimental structures, powerful node-to-node relationship learners like transformers, and masked language modeling on the protein sequence space to learn evolutionary information. While structure prediction methods like AlphaFold (AF) [1] and RoseTTAFold (RF) [2] have become ubiquitous in computational structural biology, challenges remain on multiple fronts, where ML will play an important role.

biologics

Developments in machine learning for antibody design

23 minute read

Published:

Protein structure and sequence modeling has seen a resurgence in the last couple of years owing to developments in machine learning (ML) and deep learning (DL) based techniques. These techniques come in a variety of flavours, including equivariant neural network modules that respect the structural properties of 3D macromolecules, deeper networks that benefit from the growing number of available experimental structures, powerful node-to-node relationship learners like transformers, and masked language modeling on the protein sequence space to learn evolutionary information. While structure prediction methods like AlphaFold (AF) [1] and RoseTTAFold (RF) [2] have become ubiquitous in computational structural biology, challenges remain on multiple fronts, where ML will play an important role.

climate change

Are we explorers or caretakers?

7 minute read

Published:

This was written when I was younger, and both the content and the form of my opinions on this topic have changed since then. Leaving this here for the sake of continuity.

climate sensitivity

critical mineral

The need for a critical mineral demand model incorporating technical change

17 minute read

Published:

Introduction

Studying the effects of technical change on critical mineral demand and supply in the context of the low-carbon energy transition is an important and open area of research. Despite the crucial role these minerals play in low-carbon technologies, long-term demand projections remain uncertain due to intricate interactions between drivers of technical change. In this writeup, I lay out what a framework that studies the effects of technical change on critical mineral demand would look like, how it can be developed, and what its potential use cases are.

demand models

The need for a critical mineral demand model incorporating technical change

17 minute read

Published:

Introduction

Studying the effects of technical change on critical mineral demand and supply in the context of the low-carbon energy transition is an important and open area of research. Despite the crucial role these minerals play in low-carbon technologies, long-term demand projections remain uncertain due to intricate interactions between drivers of technical change. In this writeup, I lay out what a framework that studies the effects of technical change on critical mineral demand would look like, how it can be developed, and what its potential use cases are.

environment

Are we explorers or caretakers?

7 minute read

Published:

This was written when I was younger, and both the content and the form of my opinions on this topic have changed since then. Leaving this here for the sake of continuity.

generalization

Perspectives on the future of AI

13 minute read

Published:

How big are the models going to get and how much longer is the scaling hypothesis going to hold? It’s unclear, but current performance trends, which haven’t shown signs of plateauing (GPT-4o, Claude 3.5 Sonnet, Gemini-1.5-Pro, Llama-3.1-405B, Grok-2), and the power budget of announced data centres (5 GW OpenAI/Microsoft Stargate campus) suggest that there is likely an order of magnitude (OOM) left to climb in model size. This Epoch AI research covers these scenarios in depth and estimates that training runs on the order of 2e29 FLOPs will be possible by 2030, which would be 4 OOMs larger than GPT-4 (2e25 FLOPs). These training runs will be constrained primarily by power, followed by chips, data, and latency.

global warming

hallucinations

Enhancing Factual Accuracy in Large Language Models: Integrating Decoding Strategies and Model Steering

19 minute read

Published:

The emergence of open-source Large Language Models (LLMs) like Llama has revolutionized natural language generation (NLG), making advanced conversational AI accessible to a broader audience [1]. Despite their impressive capabilities, these models often grapple with a significant challenge: factual hallucinations. Factual hallucinations occur when an AI model generates content that is unfaithful to the source material or cannot be verified against reliable data [2]. This issue is particularly concerning in critical and information-dense fields such as health, law, finance, and education, where misinformation can have catastrophic consequences [3][4].

imaginative

A time capsule

less than 1 minute read

Published:

In progress

interpretability

Enhancing Factual Accuracy in Large Language Models: Integrating Decoding Strategies and Model Steering

19 minute read

Published:

The emergence of open-source Large Language Models (LLMs) like Llama has revolutionized natural language generation (NLG), making advanced conversational AI accessible to a broader audience [1]. Despite their impressive capabilities, these models often grapple with a significant challenge: factual hallucinations. Factual hallucinations occur when an AI model generates content that is unfaithful to the source material or cannot be verified against reliable data [2]. This issue is particularly concerning in critical and information-dense fields such as health, law, finance, and education, where misinformation can have catastrophic consequences [3][4].

large language models

Enhancing Factual Accuracy in Large Language Models: Integrating Decoding Strategies and Model Steering

19 minute read

Published:

The emergence of open-source Large Language Models (LLMs) like Llama has revolutionized natural language generation (NLG), making advanced conversational AI accessible to a broader audience [1]. Despite their impressive capabilities, these models often grapple with a significant challenge: factual hallucinations. Factual hallucinations occur when an AI model generates content that is unfaithful to the source material or cannot be verified against reliable data [2]. This issue is particularly concerning in critical and information-dense fields such as health, law, finance, and education, where misinformation can have catastrophic consequences [3][4].

perspective

Perspectives on the future of AI

13 minute read

Published:

How big are the models going to get and how much longer is the scaling hypothesis going to hold? It’s unclear, but current performance trends, which haven’t shown signs of plateauing (GPT-4o, Claude 3.5 Sonnet, Gemini-1.5-Pro, Llama-3.1-405B, Grok-2), and the power budget of announced data centres (5 GW OpenAI/Microsoft Stargate campus) suggest that there is likely an order of magnitude (OOM) left to climb in model size. This Epoch AI research covers these scenarios in depth and estimates that training runs on the order of 2e29 FLOPs will be possible by 2030, which would be 4 OOMs larger than GPT-4 (2e25 FLOPs). These training runs will be constrained primarily by power, followed by chips, data, and latency.

space exploration

Are we explorers or caretakers?

7 minute read

Published:

This was written when I was younger, and both the content and the form of my opinions on this topic have changed since then. Leaving this here for the sake of continuity.

technical change

The need for a critical mineral demand model incorporating technical change

17 minute read

Published:

Introduction

Studying the effects of technical change on critical mineral demand and supply in the context of the low-carbon energy transition is an important and open area of research. Despite the crucial role these minerals play in low-carbon technologies, long-term demand projections remain uncertain due to intricate interactions between drivers of technical change. In this writeup, I lay out what a framework that studies the effects of technical change on critical mineral demand would look like, how it can be developed, and what its potential use cases are.

time capsule

A time capsule

less than 1 minute read

Published:

In progress