Speaker: Marinka Zitnik (Harvard)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F439f2e1f-bfd2-422c-b42b-021573c8b6da%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3Dffade7ba7d5c8a0b7291c5b39b7589ab57741fc56e9cc637c4bdcdc0df41aa39%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=3cb627f6-42c9-495f-997b-ec3dd69df4cb&cache=v2)
What makes graph representation techniques well suited for the analysis of high-dimensional interconnected medical data?
→ Biological systems are interconnected at different scales:
- e.g. RNA-proteins-compounds-disease
- Patient networks
- Hierarchies of cell systems
- Disease pathways
- Biomedical knowledge graphs
- Gene interaction networks
- Cell-cell similarity networks
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2Fea78484d-1e0f-4b73-9ada-626437019c4e%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3Db2301af6fdf3ae68a714f70d70e6410c58ce25b0eb60d61c23a57cd9cf5911a6%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=c41c6fe0-31eb-43c7-a05c-0abe3a9e676e&cache=v2)
Meta learning for graphs
Never-before-seen disease → we want to repurpose existing drugs, since getting new drugs approved is time-consuming.
Why is finding treatments for new diseases challenging?
- Generalising to new phenomena is hard
- Prevailing GNN methods require abundant label information
- However, labeled examples are scarce
The question is then: how can we design powerful meta-learners that transfer what was learned from one set of labeled examples to others? How do we make predictions on a new graph when we only have a handful of labels?
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F07ef438f-6c30-4e6b-a286-952ac45aa52a%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3De4ba28c6e0b2eb3cf633c310640cc1a83289706d127690109ac887a7480108c1%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=0aac1d55-82ef-498f-9301-379a1109c944&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F565d11e2-926a-4fa4-b729-b1019b8671e8%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3D5e0fe2b7b11037719cae98583dab75c0ac4014d8b8685bfcb22efa460b8005c6%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=a111b968-e5b9-4b8b-adfb-c1dfc145b82a&cache=v2)
Key idea: local subgraphs. Treat a distribution over subgraphs as the distribution over tasks from which a global set of parameters is learned.
Use this strategy to do link prediction.
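A minimal sketch of what "local subgraph" could mean in practice, assuming it is the subgraph induced by all nodes within a few hops of a node (or of both endpoints of a candidate link). The function names, the `num_hops` parameter and the toy graph are illustrative assumptions, not taken from the talk.

```python
from collections import deque


def nodes_within(adj, center, num_hops):
    """Breadth-first search: all nodes at most num_hops away from center."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if dist[node] == num_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return set(dist)


def induced_edges(adj, nodes):
    """Undirected edges of the original graph with both endpoints in nodes."""
    return {tuple(sorted((u, v))) for u in nodes for v in adj[u] if v in nodes}


def local_subgraph(adj, center, num_hops=2):
    """Local subgraph around a single node."""
    nodes = nodes_within(adj, center, num_hops)
    return nodes, induced_edges(adj, nodes)


def link_subgraph(adj, u, v, num_hops=1):
    """Assumed variant for link prediction: union of both endpoints' neighbourhoods."""
    nodes = nodes_within(adj, u, num_hops) | nodes_within(adj, v, num_hops)
    return nodes, induced_edges(adj, nodes)


# Toy path graph: a - b - c - d - e
adj = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
node_nodes, node_edges = local_subgraph(adj, "a", num_hops=2)
link_nodes, link_edges = link_subgraph(adj, "a", "e", num_hops=1)
```

Each extracted subgraph then plays the role of one example inside a task, so the model only ever has to embed small neighbourhoods rather than the full graph.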
Why are subgraphs useful?
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F46540040-3bc2-44b5-8a32-ce630e96c27b%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3D976a0be8614c0a3fba73ab08b418bbb85bddfa6b3883ef98b8a9902b0151f688%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=0ab7dc44-7b20-4d14-9fb0-b6544dbad00d&cache=v2)
When labels are scarce, label propagation is not sufficient → here structural similarity is more useful.
G-Meta learns a metric to classify query subgraphs using the closest point from the support set.
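A rough sketch of metric-based few-shot classification, assuming each subgraph has already been turned into an embedding vector (in practice by a GNN, which is omitted here). The notes only say "closest point from the support set"; averaging the support embeddings of each class into a prototype is an assumption on my part, and all names and vectors below are made up.

```python
import math
from collections import defaultdict


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype vector."""
    grouped = defaultdict(list)
    for embedding, label in zip(support_embeddings, support_labels):
        grouped[label].append(embedding)
    return {
        label: [sum(dim) / len(embeddings) for dim in zip(*embeddings)]
        for label, embeddings in grouped.items()
    }


def classify(query_embedding, prototypes):
    """Assign the query to the class whose prototype is closest (Euclidean)."""
    return min(
        prototypes,
        key=lambda label: math.dist(query_embedding, prototypes[label]),
    )


# Toy 2-D "subgraph embeddings": two labeled examples per class.
support_embeddings = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
support_labels = ["A", "A", "B", "B"]

prototypes = class_prototypes(support_embeddings, support_labels)
prediction_near_a = classify([1.0, 1.0], prototypes)
prediction_near_b = classify([4.0, 3.0], prototypes)
```

With only a couple of labels per class, everything the classifier knows lives in the embedding space, which is why the quality of the learned metric matters more than the number of labels.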
COVID-19 Drug Repurposing
COVID-19 Repurposing Dataset
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2Fe3fae32e-34a3-408c-83df-9c0ae30673b7%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3D3f6dc54b315069f0017b082bac13d5e5528d3e7a3ca329c80cec5fc07db6b0a1%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=ba7200ce-f4c9-4175-a9b3-420b0a6bafc0&cache=v2)
Which human proteins does the virus bind to? Interactions between human and viral proteins.
How to represent COVID-19? As the neighbourhood of the human protein-protein interaction (PPI) network that is targeted by the virus.
![One of these diseases is COVID-19. The closest drugs are displayed.](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F2eb98525-3ec1-44b0-8d3e-a52734cd69e3%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3D98cfe367504586449c88fece693c111140b0ec1b184562f5e1524f7275eca6f6%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=87fb91c0-df14-4aa0-b4a0-3351f115d652&cache=v2)
![TL;DR: AI-based methods are really good here!](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F8f50be90-2182-4630-82fe-7cac8a7ce5ed%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3D23770a42f89636146a4bfb314e0e5436dbeec40bb82fc28a9db9acb40f4b7a54%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=06472068-f10d-4ea1-b636-3c28d6a84ed1&cache=v2)
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F067e0d6c-fe4a-4a78-8f5f-0f0420d5fc9d%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3D664a7b2c2cdb09ae56a3515ca5785dcc0856885a0278d5c1b01c7b527aca3a54%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=03800f5a-9de4-486b-bdee-75810ad003cc&cache=v2)
Results:
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F2ef7e24b-c22b-4b7c-bc8b-0003a7104699%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3D7eed463cfffcde5bb4a822a790cffcdb3c0ca71ae5ef92366228638887f41638%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=a2fb2fb2-859a-401d-8016-ac3ec578ae26&cache=v2)
Interesting finding: 76/77 drugs that successfully reduced viral infection do not directly bind proteins targeted by the virus.
These drugs rely on network-based actions that can't be identified by traditional docking-based strategies.
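To make the distinction between direct binding and network-based action concrete, here is a toy sketch. It assumes a drug acts "directly" if one of its targets is itself a virus-targeted protein, and otherwise measures how many PPI hops separate the two sets. The graph, the protein names `P1`..`P5` and the distance measure are hypothetical illustrations, not the analysis from the talk.

```python
from collections import deque


def distance_to_set(adj, sources, targets):
    """Fewest hops from any source node to any target node (None if unreachable)."""
    targets = set(targets)
    dist = {source: 0 for source in sources}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if node in targets:
            return dist[node]  # BFS visits nodes in order of increasing distance
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return None


# Hypothetical PPI network: P1 - P3, P2 - P3, P3 - P4, P4 - P5
ppi = {
    "P1": ["P3"],
    "P2": ["P3"],
    "P3": ["P1", "P2", "P4"],
    "P4": ["P3", "P5"],
    "P5": ["P4"],
}
virus_targets = {"P1", "P2"}  # human proteins the virus binds to
drug_targets = {"P4"}         # human proteins the drug binds to

binds_directly = bool(drug_targets & virus_targets)
hops = distance_to_set(ppi, drug_targets, virus_targets)
```

In this toy case the drug shares no target with the virus but sits two hops away in the network, which is the kind of relationship a method that only looks at direct binding would miss.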
Key ML Lessons
- Domain scientists without AI expertise still need a way to interact with AI systems and to feed their knowledge back into the ML loop.
- Zero-shot learning: generalising to new graphs is hard.
Between Organisms
Additional idea: how can we leverage this other kind of graph transfer learning? Can we learn about humans from what we know about other organisms? G-Meta can also be used here.
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F5fbdd854-f5b4-42ef-96c4-51711a1387c4%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3Df08a6c837803f960831d2b38b57891b890f9bb28dd71ac7d28ab34967c7309eb%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=24e15e10-992d-4bed-b6b2-2a761e324fa2&cache=v2)
![Key difference here: trained across many different graphs.](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F93deded4-f57a-4500-8727-a99aef71058e%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3D9348d86bcd52265c9900b3df080bfaa99f971d392a83871b51a5bbdb5dc8ba1a%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=459ff022-50d8-4f9f-ac5f-bc166fcbd0a3&cache=v2)
Here G-Meta improves on previous work by 29.9% and scales to large graphs (a 100x increase in graph size).
Therapeutics Data Commons
First unified framework to systematically evaluate ML across a range of therapeutics. Creates new opportunities for graph learning to be applied.
![notion image](https://www.notion.so/image/https%3A%2F%2Fs3.us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F97f649bb-437b-45fb-ba0b-8351e3b91390%2FUntitled.png%3FX-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Content-Sha256%3DUNSIGNED-PAYLOAD%26X-Amz-Credential%3DAKIAT73L2G45EIPT3X45%252F20221016%252Fus-west-2%252Fs3%252Faws4_request%26X-Amz-Date%3D20221016T192401Z%26X-Amz-Expires%3D86400%26X-Amz-Signature%3D8a7c33d87ff6d0b3fb78cb1573e678a71b94999ca919a78a93e2ab77059c06b6%26X-Amz-SignedHeaders%3Dhost%26x-id%3DGetObject?table=block&id=7cd4d678-684d-4d7b-93b5-1b8cdb932eee&cache=v2)