OpenAI starts offering a biology-tuned LLM
Ars Technica
Machine Learning/Research
#ai models
#openai
#machine learning/research
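Summary
OpenAI has announced GPT-Rosalind, a biology-specific large language model named after Rosalind Franklin. The model was trained on 50 common biology workflows and on how to use the major public databases, with a focus on helping researchers make sense of massive datasets and on breaking down barriers between specialized subfields. GPT-Rosalind can also be used to reason about biological pathways, for example by connecting genotype to phenotype and suggesting a prioritized list of potential drug targets.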
Why it matters
Related entities
OpenAI
GPT-Rosalind
Rosalind Franklin
Full text
On Thursday, OpenAI announced it had developed a large language model specifically trained on common biology workflows. Called GPT-Rosalind after Rosalind Franklin, the model appears to differ from most science-focused models from major tech companies, which have generally taken a more generic approach that works for various fields.

In a press briefing, Yunyun Wang, OpenAI's Life Sciences Product Lead, said the system was designed to tackle two major roadblocks faced by current biology researchers. One is the massive datasets created by decades of genome sequencing and protein biochemistry, which can be too much for any one researcher to take in. The second is that biology has many highly specialized subfields, each with its own techniques and jargon. So, for example, a geneticist who finds themselves working on a gene that's active in brain cells might struggle to understand the immense neurobiological literature.

Wang said the company had taken an LLM and trained it on 50 of the most common biological workflows, as well as on how to access the major public databases of biological information. Further training has resulted in a system that can suggest likely biological pathways and prioritize potential drug targets.

"We're connecting genotype to phenotype through known pathways and regulatory mechanisms, infer likely structural or functional properties of proteins, and really leveraging this mechanistic understanding," Wang said.