<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://joonlab.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://joonlab.github.io/" rel="alternate" type="text/html" /><updated>2026-01-11T20:26:03+09:00</updated><id>https://joonlab.github.io/feed.xml</id><title type="html">준랩 | JoonLab</title><subtitle>준랩의 블로그입니다</subtitle><author><name>Park Joon</name></author><entry><title type="html">나노바나나 프로(Nano Banana Pro) 실제 사례 가이드</title><link href="https://joonlab.github.io/coding/nano-banana-pro-use-cases/" rel="alternate" type="text/html" title="나노바나나 프로(Nano Banana Pro) 실제 사례 가이드" /><published>2026-01-11T00:00:00+09:00</published><updated>2026-01-11T00:00:00+09:00</updated><id>https://joonlab.github.io/coding/nano-banana-pro-use-cases</id><content type="html" xml:base="https://joonlab.github.io/coding/nano-banana-pro-use-cases/"><![CDATA[<blockquote>
  <p><strong>작성일</strong>: 2026년 1월 11일
<strong>주제</strong>: 구글의 AI 이미지 생성 도구 ‘나노바나나 프로’의 실제 활용 사례 및 분석</p>
</blockquote>

<hr />

<h2 id="1-나노바나나-프로란">1. 나노바나나 프로란?</h2>

<p><strong>나노바나나 프로(Nano Banana Pro)</strong>는 구글 딥마인드가 2025년 11월 20일 출시한 최신 AI 이미지 생성 및 편집 모델입니다. 공식 명칭은 <strong>Gemini 3 Pro Image</strong>이며, 기존 나노바나나(Gemini 2.5 Flash Image 기반)를 대폭 업그레이드한 버전입니다.</p>

<p><img src="https://raw.githubusercontent.com/joonlab/md-share-db/main/images/google-blog-hero_402109.png" alt="나노바나나 프로 공식 소개" /></p>

<blockquote>
  <p>출처: <a href="https://blog.google/technology/ai/nano-banana-pro/">Google Blog - Introducing Nano Banana Pro</a></p>
</blockquote>

<h3 id="탄생-배경">탄생 배경</h3>

<p>2025년 8월, AI 모델 비교 평가 플랫폼인 <strong>LM Arena</strong>에 정체불명의 “nano-banana”라는 모델이 등장했습니다. 이 모델은 기존 이미지 생성 모델들을 크게 능가하는 품질과 사물 이해력으로 큰 화제를 모았고, 이후 구글이 공식적으로 자사 모델임을 밝히며 출시했습니다.</p>

<h3 id="핵심-특징">핵심 특징</h3>

<table>
  <thead>
    <tr>
      <th>기능</th>
      <th>설명</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Perfect Text Rendering</strong></td>
      <td>다국어로 선명하고 읽기 쉬운 텍스트 구현</td>
    </tr>
    <tr>
      <td><strong>Thinking Mode</strong></td>
      <td>AI가 렌더링 전 구성을 계획하여 최적화된 결과 생성</td>
    </tr>
    <tr>
      <td><strong>2K/4K 해상도</strong></td>
      <td>고해상도 출력 지원</td>
    </tr>
    <tr>
      <td><strong>14개 참조 이미지 융합</strong></td>
      <td>스타일, 로고, 캐릭터 얼굴 학습 가능</td>
    </tr>
    <tr>
      <td><strong>5명 캐릭터 일관성</strong></td>
      <td>최대 5명의 인물 외형 특징 유지</td>
    </tr>
    <tr>
      <td><strong>구글 검색 통합</strong></td>
      <td>실시간 정보 기반 이미지 생성</td>
    </tr>
  </tbody>
</table>

<p><img src="https://raw.githubusercontent.com/joonlab/md-share-db/main/images/google-blog-examples1_7d4b6e.png" alt="나노바나나 프로 기능 소개" /></p>

<blockquote>
  <p>출처: <a href="https://blog.google/technology/ai/nano-banana-pro/">Google Blog - Introducing Nano Banana Pro</a></p>
</blockquote>

<hr />

<h2 id="2-다국어-텍스트-렌더링-정확도">2. 다국어 텍스트 렌더링 정확도</h2>

<p>나노바나나 프로의 가장 강력한 기능 중 하나는 <strong>이미지 내 텍스트 렌더링</strong>입니다.</p>

<h3 id="언어별-정확도">언어별 정확도</h3>

<table>
  <thead>
    <tr>
      <th>언어</th>
      <th>정확도</th>
      <th>비고</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>영어</td>
      <td>94%</td>
      <td>가장 높은 정확도</td>
    </tr>
    <tr>
      <td>한국어</td>
      <td>90%</td>
      <td>한글의 체계적 구조 덕분에 높은 성능</td>
    </tr>
    <tr>
      <td>중국어</td>
      <td>88%</td>
      <td>복잡한 획수 문자에서 오류 발생</td>
    </tr>
    <tr>
      <td>일본어</td>
      <td>85%</td>
      <td>히라가나 &gt; 가타카나 &gt; 한자 순 정확도</td>
    </tr>
  </tbody>
</table>

<h3 id="실제-사례-다국어-간판-변환">실제 사례: 다국어 간판 변환</h3>

<p>영어로 된 <strong>STOP 표지판</strong>을 한국어 <strong>‘정지’ 간판</strong>으로 변환하는 테스트에서:</p>
<ul>
  <li>배경, 조명, 구도는 그대로 유지</li>
  <li>텍스트만 깔끔하게 교체</li>
  <li>기존 AI 모델에서는 어려웠던 작업이 자연스럽게 구현</li>
</ul>

<hr />

<h2 id="3-실제-활용-사례">3. 실제 활용 사례</h2>

<h3 id="24가지-실제-활용-사례-개요">24가지 실제 활용 사례 개요</h3>

<p><img src="https://raw.githubusercontent.com/joonlab/md-share-db/main/images/imagineart-usecases1_07c4d1.png" alt="ImagineArt 24가지 사례" /></p>

<blockquote>
  <p>출처: <a href="https://www.imagine.art/blogs/nano-banana-pro-use-cases">ImagineArt - 24 Mind-Blowing Nano Banana Pro Use Cases</a></p>
</blockquote>

<h3 id="31-마케팅-콘텐츠-제작">3.1 마케팅 콘텐츠 제작</h3>

<h4 id="소셜-미디어-그래픽">소셜 미디어 그래픽</h4>
<ul>
  <li><strong>LinkedIn, Instagram, TikTok</strong>용 시각 콘텐츠 제작</li>
  <li>일반 스톡 사진 대신 브랜드 맞춤형 비주얼 생성</li>
  <li>게시물 각도에 맞춘 정확한 시각화</li>
</ul>

<p><strong>활용 효과</strong>: 소셜 미디어 마케팅팀이 단일 캠페인 비주얼에서 <strong>다양한 포맷을 자동 생성</strong>하여 게시물 준비 시간을 <strong>75% 단축</strong>했다는 사례가 보고됨</p>

<h4 id="유튜브-썸네일">유튜브 썸네일</h4>
<ul>
  <li>클릭률(CTR)을 결정짓는 핵심 요소</li>
  <li>영상의 주제와 분위기를 즉시 전달하는 강렬한 구도</li>
  <li>대담한 비주얼 구성 자동화</li>
</ul>

<hr />

<h3 id="32-이커머스--쇼핑몰-활용">3.2 이커머스 &amp; 쇼핑몰 활용</h3>

<h4 id="상세페이지-자동화">상세페이지 자동화</h4>

<p><strong>핵심 워크플로우</strong>:</p>
<ol>
  <li>제품 사진과 기능 요약, 가격 정보를 하나의 페이지에 배치</li>
  <li>포토샵 없이 쇼핑몰용 이미지 제작</li>
  <li>지역 특산품의 품종별 특성을 이미지로 시각화</li>
</ol>

<p><strong>실제 성과</strong>:</p>
<ul>
  <li>상세페이지 이미지 <strong>1분 이내 생성</strong></li>
  <li>기획과 제작 시간 <strong>대폭 단축</strong></li>
  <li>제품 페이지 전환율 <strong>15-30% 향상</strong> (보고된 사례 기준)</li>
</ul>

<h4 id="제품-사진-편집">제품 사진 편집</h4>

<table>
  <thead>
    <tr>
      <th>작업 유형</th>
      <th>설명</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>배경 교체</td>
      <td>제품 주위 배경을 오피스 공간, 야외 등으로 변경</td>
    </tr>
    <tr>
      <td>먼지/소품 제거</td>
      <td>상업용 제품 사진 정리</td>
    </tr>
    <tr>
      <td>조명 조화</td>
      <td>제품과 배경의 조명 통일</td>
    </tr>
    <tr>
      <td>라이프스타일 합성</td>
      <td>제품이 실제 사용되는 듯한 자연스러운 라이프스타일 환경 합성</td>
    </tr>
  </tbody>
</table>

<hr />

<h3 id="33-인포그래픽-제작">3.3 인포그래픽 제작</h3>

<h4 id="특징">특징</h4>
<ul>
  <li>구글 검색 연동으로 <strong>실제 정보 기반</strong> 시각화</li>
  <li>복잡한 내용을 아이콘, 도형, 텍스트로 정리</li>
  <li>여행 가이드, 블로그, 인스타그램에 최적화</li>
</ul>

<h4 id="실제-사례-식물-인포그래픽">실제 사례: 식물 인포그래픽</h4>

<p><img src="https://raw.githubusercontent.com/joonlab/md-share-db/main/images/google-blog-examples2_209672.png" alt="인포그래픽 제작 사례" /></p>

<blockquote>
  <p>실제 사례: “String of Turtles” 식물 사진 → Leaf Pattern, Origin &amp; Habitat, Growth Habit, Care Essentials 등 실제 정보 기반 인포그래픽 자동 생성
출처: <a href="https://blog.google/technology/ai/nano-banana-pro/">Google Blog</a></p>
</blockquote>

<p><strong>프롬프트 예시</strong>: <code class="language-plaintext highlighter-rouge">Create an infographic about this plant focusing on interesting information.</code></p>

<p>이 예시에서 나노바나나 프로는:</p>
<ul>
  <li>식물의 실제 학명(Peperomia prostrata) 검색</li>
  <li>원산지, 성장 습성, 관리 방법 등 <strong>실제 정보 기반</strong> 시각화</li>
  <li>깔끔한 레이아웃과 아이콘 자동 배치</li>
</ul>

<h4 id="사용자-평가">사용자 평가</h4>
<blockquote>
  <p>“철자 오류 없이 인포그래픽을 완성했다”
“레스토랑 메뉴를 완벽한 레이아웃과 타이포그래피로 한 번에 생성했다”</p>
</blockquote>

<hr />

<h3 id="34-스토리보드--영상-제작">3.4 스토리보드 &amp; 영상 제작</h3>

<h4 id="스토리보드-자동-생성">스토리보드 자동 생성</h4>

<p><img src="https://raw.githubusercontent.com/joonlab/md-share-db/main/images/google-blog-examples3_cecc4a.png" alt="스토리보드 제작 사례" /></p>

<blockquote>
  <p>실제 사례: 우주비행사 이미지 한 장 → Establishing Shot, Medium Shot, Close-up, POV Shot 등 영화 촬영 기법에 맞춘 스토리보드 자동 생성
출처: <a href="https://blog.google/technology/ai/nano-banana-pro/">Google Blog</a></p>
</blockquote>

<p><strong>프롬프트 예시</strong>: <code class="language-plaintext highlighter-rouge">Create a storyboard for this scene</code></p>

<p>이 기능은 다음과 같은 분야에서 활용됩니다:</p>
<ul>
  <li><strong>영화/드라마 프리프로덕션</strong>: 촬영 전 장면 구성 시각화</li>
  <li><strong>광고 제작</strong>: 광고 콘티 빠르게 제작</li>
  <li><strong>유튜브 콘텐츠</strong>: 영상 기획 단계에서 활용</li>
</ul>

<hr />

<h3 id="35-브랜드-일관성-유지">3.5 브랜드 일관성 유지</h3>

<h4 id="기능">기능</h4>
<ul>
  <li>최대 <strong>14개 참조 이미지</strong> 업로드로 스타일 학습</li>
  <li>로고, 컬러 팔레트, 제품 샷 적용</li>
  <li>기업 아이덴티티와 캠페인 기준에 맞춘 자산 생성</li>
</ul>

<h4 id="활용-분야">활용 분야</h4>
<ul>
  <li><strong>코믹북 제작자</strong>: 50개 다른 포즈에서도 캐릭터 얼굴 유지</li>
  <li><strong>스토리보드 아티스트</strong>: 일관된 캐릭터로 시퀀스 제작</li>
  <li><strong>마케팅 캠페인</strong>: 반복되는 캐릭터/마스코트 자산 유지</li>
</ul>

<hr />

<h3 id="36-글로벌다국어-캠페인">3.6 글로벌/다국어 캠페인</h3>

<p><img src="https://raw.githubusercontent.com/joonlab/md-share-db/main/images/google-korea-blog2_c00c14.png" alt="구글 코리아 블로그" /></p>

<blockquote>
  <p>출처: <a href="https://blog.google/intl/ko-kr/company-news/technology/nano-banana-pro/">구글 코리아 블로그 - 나노바나나 프로 소개</a></p>
</blockquote>

<h4 id="기능-1">기능</h4>
<ul>
  <li>이미지 내 텍스트 번역 및 렌더링 지원</li>
  <li>마케팅 자료, 제품 패키징, 프로모션 비주얼 <strong>즉시 현지화</strong></li>
</ul>

<h4 id="지원-언어">지원 언어</h4>
<p>영어, 중국어(간체/번체), 일본어, 한국어, 프랑스어, 독일어, 스페인어, 이탈리아어, 포르투갈어, 러시아어, 아랍어 등</p>

<hr />

<h2 id="4-전문가-리뷰--고급-프롬프트-기법">4. 전문가 리뷰 &amp; 고급 프롬프트 기법</h2>

<h3 id="max-woolf의-심층-리뷰">Max Woolf의 심층 리뷰</h3>

<p><img src="https://raw.githubusercontent.com/joonlab/md-share-db/main/images/maxwoolf-review1_981528.png" alt="Max Woolf 리뷰" /></p>

<blockquote>
  <p>출처: <a href="https://minimaxir.com/2025/12/nano-banana-pro/">Max Woolf’s Blog - “Nano Banana Pro is the best AI image generator, with caveats”</a></p>
</blockquote>

<h3 id="고급-프롬프트-엔지니어링">고급 프롬프트 엔지니어링</h3>

<p><img src="https://raw.githubusercontent.com/joonlab/md-share-db/main/images/maxwoolf-review2_ddaaf3.png" alt="복잡한 프롬프트 예시" /></p>

<blockquote>
  <p>복잡한 제약 조건을 가진 프롬프트도 정확하게 처리: 3마리 고양이의 색상, 포즈, 의상, 배경, 조명까지 모두 지정
출처: <a href="https://minimaxir.com/2025/12/nano-banana-pro/">Max Woolf’s Blog</a></p>
</blockquote>

<p>위 예시에서 사용된 프롬프트:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Create an image featuring three specific kittens in three specific positions.

All of the kittens MUST follow these descriptions EXACTLY:
- Left: a kitten with prominent black-and-silver fur, wearing both blue denim overalls and a blue plain denim baseball hat.
- Middle: a kitten with prominent white-and-gold fur and prominent gold-colored long goatee facial hair, wearing a 24k-carat golden monocle.
- Right: a kitten with prominent #9F2B68-and-#00FF00 fur, wearing a San Francisco Giants sports jersey.

Aspects of the image composition that MUST be followed EXACTLY:
- All kittens MUST be positioned according to the "rule of thirds" both horizontally and vertically.
- All kittens MUST lay prone, facing the camera.
- All kittens MUST have heterochromatic eye colors matching their two specified fur colors.
- The image is shot on top of a bed in a multimillion-dollar Victorian mansion.
- The image is a Pulitzer Prize winning cover photo for The New York Times with neutral diffuse 3PM lighting for both the subjects and background that complement each other.
- NEVER include any text, watermarks, or line overlays.
</code></pre></div></div>

<p><strong>핵심 포인트</strong>: 나노바나나 프로는 이러한 복잡한 제약 조건들을 모두 정확하게 반영하여 이미지를 생성할 수 있습니다.</p>

<hr />

<h2 id="5-추천-프롬프트-예시">5. 추천 프롬프트 예시</h2>

<p><img src="https://raw.githubusercontent.com/joonlab/md-share-db/main/images/prompt-guide1_aaf802.png" alt="프롬프트 가이드" /></p>

<blockquote>
  <p>출처: <a href="https://www.imagine.art/blogs/nano-banana-pro-prompt-guide">ImagineArt - Nano Banana Pro Prompt Guide</a></p>
</blockquote>

<h3 id="피규어-제작">피규어 제작</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Draw a prospective model of the character in the picture,
commercialized as a 1/7 scale figure.
</code></pre></div></div>

<h3 id="3d-치비-피규어">3D 치비 피규어</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>3D chibi figurine of character in action pose, glossy vinyl toy finish,
soft pastel color palette, studio lighting with gentle rim light,
floating on solid color background, Funko Pop style,
rendered in Blender quality, 4K detail
</code></pre></div></div>

<h3 id="상세페이지-제작">상세페이지 제작</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>제품 주위 배경을 넓게 확장한 뒤에 오피스 공간으로 바꿔줘.
그리고 제품 위에 핀 조명이 은은하게 비췄으면 좋겠어.
</code></pre></div></div>

<h3 id="인포그래픽-제작">인포그래픽 제작</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Create an infographic about this plant focusing on interesting information.
</code></pre></div></div>

<h3 id="스토리보드-제작">스토리보드 제작</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Create a storyboard for this scene
</code></pre></div></div>

<h3 id="이미지-편집">이미지 편집</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In this picture, remove the person on the left.
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Change the background of this image to a beach at sunset.
</code></pre></div></div>

<hr />

<h2 id="6-가격-및-이용-방법">6. 가격 및 이용 방법</h2>

<h3 id="가격표">가격표</h3>

<table>
  <thead>
    <tr>
      <th>해상도</th>
      <th>가격(장당)</th>
      <th>비고</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>1K</td>
      <td>$0.134</td>
      <td>GPT-Image-1보다 저렴</td>
    </tr>
    <tr>
      <td>2K</td>
      <td>$0.134</td>
      <td> </td>
    </tr>
    <tr>
      <td>4K</td>
      <td>$0.24</td>
      <td>프리미엄 품질</td>
    </tr>
  </tbody>
</table>
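
<p><strong>비용 계산 예시</strong>: 상세페이지용 2K 이미지를 50장 생성한다면 약 50 × $0.134 = <strong>$6.70</strong>이 과금됩니다. 요금은 변경될 수 있으므로 공식 가격 정책 확인을 권장합니다.</p>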

<h3 id="요금제별-워터마크-정책">요금제별 워터마크 정책</h3>

<table>
  <thead>
    <tr>
      <th>요금제</th>
      <th>워터마크</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>무료</td>
      <td>우측 하단 반짝이(Sparkle) 마크</td>
    </tr>
    <tr>
      <td>프로(Pro)</td>
      <td>우측 하단 반짝이(Sparkle) 마크</td>
    </tr>
    <tr>
      <td>울트라(Ultra)</td>
      <td><strong>워터마크 없음</strong></td>
    </tr>
  </tbody>
</table>

<h3 id="접근-방법">접근 방법</h3>

<ol>
  <li><strong>Gemini 앱</strong>: Create Image → “Thinking with 3 Pro” 선택</li>
  <li><strong>Google AI Studio</strong>: API 접근 (아래 예시 참고)</li>
  <li><strong>Vertex AI</strong>: 기업용 API 통합</li>
  <li><strong>제3자 플랫폼</strong>: ImagineArt, Adobe Creative Cloud</li>
</ol>
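
<p>참고로, Google AI Studio에서 발급받은 API 키로 나노바나나 프로를 호출하는 최소 파이썬 스케치는 다음과 같습니다. <code class="language-plaintext highlighter-rouge">google-genai</code> SDK 사용을 가정하며, 모델 식별자 <code class="language-plaintext highlighter-rouge">gemini-3-pro-image-preview</code>는 공개 정보에 기반한 추정이므로 실제 명칭은 공식 문서에서 확인이 필요합니다.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 가정: pip install google-genai 로 SDK 설치, 발급받은 키를 GEMINI_API_KEY 환경 변수에 저장
from google import genai

client = genai.Client()  # GEMINI_API_KEY 환경 변수에서 API 키를 자동으로 읽음

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",  # 추정한 모델 식별자, 공식 문서에서 확인 필요
    contents="Create an infographic about this plant focusing on interesting information.",
)

# 응답 파트 중 이미지 데이터를 찾아 파일로 저장
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("infographic.png", "wb") as f:
            f.write(part.inline_data.data)
</code></pre></div></div>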

<h3 id="무료-체험-방법">무료 체험 방법</h3>
<ul>
  <li><strong>LM Arena</strong> 사이트에서 100% 무료 테스트 가능</li>
  <li>Gemini 무료 버전: 하루 1~3개 생성 가능</li>
</ul>

<hr />

<h2 id="7-장점과-한계">7. 장점과 한계</h2>

<h3 id="장점">장점</h3>

<table>
  <thead>
    <tr>
      <th>항목</th>
      <th>설명</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>텍스트 정확도</td>
      <td>다국어 텍스트 렌더링 업계 최고 수준</td>
    </tr>
    <tr>
      <td>합성 능력</td>
      <td>자연스러운 이미지 합성 및 편집</td>
    </tr>
    <tr>
      <td>일관성</td>
      <td>다중 이미지에서 캐릭터/브랜드 일관성 유지</td>
    </tr>
    <tr>
      <td>속도</td>
      <td>경쟁 모델 대비 빠른 생성 속도</td>
    </tr>
    <tr>
      <td>지식 통합</td>
      <td>구글 검색 연동으로 정확한 정보 기반 생성</td>
    </tr>
  </tbody>
</table>

<h3 id="한계">한계</h3>

<table>
  <thead>
    <tr>
      <th>항목</th>
      <th>설명</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>한글 완벽도</td>
      <td>약 90% (100% 아님)</td>
    </tr>
    <tr>
      <td>사용량 제한</td>
      <td>Pro 플랜에서도 20~30장 생성 후 제한되는 사례가 보고됨</td>
    </tr>
    <tr>
      <td>순수 생성 품질</td>
      <td>Text-to-image가 Image-to-image보다 다소 낮음</td>
    </tr>
    <tr>
      <td>정확한 정보</td>
      <td>지도, 행정구역 등 정확성 요구 요소는 확인 필요</td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="8-추천-대상">8. 추천 대상</h2>

<h3 id="강력-추천">강력 추천</h3>
<ul>
  <li>이커머스 운영자</li>
  <li>마케팅 에이전시</li>
  <li>콘텐츠 크리에이터 (유튜버, 블로거)</li>
  <li>제품 사진 리소스가 부족한 브랜드</li>
  <li>데이터 시각화/인포그래픽 제작자</li>
  <li>글로벌 다국어 캠페인 운영 팀</li>
</ul>

<h3 id="사용-시-주의">사용 시 주의</h3>
<ul>
  <li>한글 텍스트 최종 확인 필수</li>
  <li>정확한 정보(지도, 수치 등) 별도 검증 필요</li>
  <li>복잡한 합성 작업 시 프롬프트 보정 고려</li>
</ul>

<hr />

<h2 id="9-결론">9. 결론</h2>

<p>나노바나나 프로는 <strong>텍스트 렌더링</strong>, <strong>브랜드 일관성</strong>, <strong>다국어 지원</strong> 측면에서 현존 최고 수준의 AI 이미지 생성 도구입니다. 특히 마케팅 콘텐츠, 이커머스 상세페이지, 인포그래픽 제작에서 <strong>생산성을 획기적으로 향상</strong>시킬 수 있습니다.</p>

<p>다만 한글 완벽도가 100%는 아니므로, 상업적 활용 시에는 <strong>최종 결과물 검토</strong>를 권장합니다.</p>

<hr />

<h2 id="참고-자료">참고 자료</h2>

<ul>
  <li><a href="https://blog.google/intl/ko-kr/company-news/technology/nano-banana-pro/">구글 공식 블로그 - 나노바나나 프로 소개</a></li>
  <li><a href="https://blog.google/technology/ai/nano-banana-pro/">Google Blog - Introducing Nano Banana Pro</a></li>
  <li><a href="https://namu.wiki/w/나노%20바나나">나무위키 - 나노 바나나</a></li>
  <li><a href="https://minimaxir.com/2025/12/nano-banana-pro/">Max Woolf’s Blog - Nano Banana Pro Review</a></li>
  <li><a href="https://cybernews.com/ai-tools/nano-banana-pro-review/">Cybernews - Nano Banana Pro Review</a></li>
  <li><a href="https://www.imagine.art/blogs/nano-banana-pro-use-cases">ImagineArt - 24 Mind-Blowing Use Cases</a></li>
  <li><a href="https://www.imagine.art/blogs/nano-banana-pro-prompt-guide">ImagineArt - Prompt Guide</a></li>
  <li><a href="https://willfrancis.com/digital-marketing-use-cases-prompts-for-nano-banana-pro/">Will Francis - Digital Marketing Prompts</a></li>
  <li><a href="https://adsensefarm.kr/nano-banana-pro-guide/">애드센스팜 - 완벽 가이드</a></li>
  <li><a href="https://carat.im/blog/nano-banana-ai-guide">캐럿 블로그 - 총정리</a></li>
</ul>

<hr />

<p><em>이 문서의 스크린샷들은 각 출처 웹사이트에서 직접 캡처한 것입니다.</em></p>]]></content><author><name>Park Joon</name></author><category term="coding" /><category term="AI" /><category term="이미지생성" /><category term="Nano Banana Pro" /><category term="Google" /><category term="Gemini" /><summary type="html"><![CDATA[구글의 AI 이미지 생성 도구 '나노바나나 프로'의 실제 활용 사례 및 분석]]></summary></entry><entry><title type="html">GCP 무료체험 등록 및 Gemini API 키 발급 가이드</title><link href="https://joonlab.github.io/coding/gcp-gemini-api-guide/" rel="alternate" type="text/html" title="GCP 무료체험 등록 및 Gemini API 키 발급 가이드" /><published>2025-01-09T00:00:00+09:00</published><updated>2025-01-09T00:00:00+09:00</updated><id>https://joonlab.github.io/coding/gcp-gemini-api-guide</id><content type="html" xml:base="https://joonlab.github.io/coding/gcp-gemini-api-guide/"><![CDATA[<h2 id="가이드-영상">가이드 영상</h2>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/5Mh1XlUJawQ" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<h2 id="개요">개요</h2>

<p>이 가이드에서는 Google의 Gemini AI를 활용하기 위한 사전 준비 과정을 안내합니다.
<strong>Google Cloud Platform(GCP) 등록</strong>과 <strong>API 키 발급</strong>을 단계별로 진행해보겠습니다.</p>

<blockquote>
  <p><strong>안심하세요!</strong></p>
  <ul>
    <li>신규 가입 시 제공되는 <strong>$300 무료 크레딧</strong>을 사용하므로, 개인 비용이 청구되지 않습니다.</li>
    <li>본인 확인을 위한 신용카드 등록 과정이 있지만, 유료 계정으로 직접 업그레이드하기 전까지는 자동 결제되지 않습니다.</li>
  </ul>
</blockquote>

<hr />

<h2 id="1단계-google-cloud-platform-gcp-무료-가입">1단계: Google Cloud Platform (GCP) 무료 가입</h2>

<h3 id="1-구글-클라우드-접속">1. 구글 클라우드 접속</h3>

<p>구글에 <code class="language-plaintext highlighter-rouge">gcp</code>를 검색하거나 <a href="https://cloud.google.com/free" target="_blank">https://cloud.google.com/free</a> 에 접속하여 메인 화면의 파란색 버튼을 클릭합니다.</p>

<p><img src="/assets/images/gcp-gemini-guide/gcp-01-search.jpg" alt="구글 검색 결과 화면" /></p>

<p><img src="/assets/images/gcp-gemini-guide/gcp-02-main.jpg" alt="구글 클라우드 메인 화면" /></p>

<h3 id="2-국가-선택-및-약관-동의-12단계">2. 국가 선택 및 약관 동의 (1/2단계)</h3>

<p>로그인 후 나오는 첫 번째 화면입니다.</p>

<ul>
  <li><strong>국가:</strong> ‘대한민국’ 확인 (또는 선택)</li>
  <li><strong>약관:</strong> 서비스 약관 동의 체크 후 [계속] 클릭</li>
</ul>

<p><img src="/assets/images/gcp-gemini-guide/gcp-03-step1.jpg" alt="가입 1단계 화면" /></p>

<h3 id="3-본인-인증-및-결제-정보-등록-22단계">3. 본인 인증 및 결제 정보 등록 (2/2단계)</h3>

<p>계정 유형과 주소, 카드를 등록합니다.</p>

<ul>
  <li><strong>계정 유형:</strong> 개인</li>
  <li><strong>주소:</strong> 본인 주소 입력</li>
  <li><strong>결제 수단:</strong> 해외 결제 가능한 카드 정보 입력</li>
</ul>

<p><img src="/assets/images/gcp-gemini-guide/gcp-04-step2-type.jpg" alt="가입 2단계 화면" /></p>

<p><img src="/assets/images/gcp-gemini-guide/gcp-05-step2-submit.jpg" alt="정보 입력 완료 화면" /></p>

<h3 id="4-가입-완료-및-콘솔-진입">4. 가입 완료 및 콘솔 진입</h3>

<p>설문 팝업이 뜨면 답변하거나 건너뛰세요. <code class="language-plaintext highlighter-rouge">My First Project</code>라는 문구와 함께 환영 메시지가 보이면 성공입니다.</p>

<p><img src="/assets/images/gcp-gemini-guide/gcp-06-welcome.jpg" alt="GCP 콘솔 환영 화면" /></p>

<hr />

<h2 id="2단계-gemini-api-키-발급-google-ai-studio">2단계: Gemini API 키 발급 (Google AI Studio)</h2>

<p>방금 만든 GCP 프로젝트(무료 크레딧)를 연동하여 AI 모델을 사용할 수 있는 키를 발급받습니다.</p>

<h3 id="1-google-ai-studio-접속">1. Google AI Studio 접속</h3>

<p>주소창에 <a href="https://aistudio.google.com/" target="_blank">https://aistudio.google.com/</a> 을 입력하여 이동합니다.</p>

<p><img src="/assets/images/gcp-gemini-guide/gcp-07-aistudio.jpg" alt="AI Studio 접속 화면" /></p>

<h3 id="2-api-키-메뉴-선택">2. API 키 메뉴 선택</h3>

<p>화면 좌측 메뉴 중 열쇠 모양 아이콘, 혹은 <strong>[Get API key]</strong> 버튼을 클릭합니다.</p>

<p><img src="/assets/images/gcp-gemini-guide/gcp-08-getapikey.jpg" alt="Get API key 메뉴 화면" /></p>

<h3 id="3-api-키-생성-및-프로젝트-연결-중요">3. API 키 생성 및 프로젝트 연결 (중요)</h3>

<p><strong>[API 키 만들기]</strong> 버튼을 누른 뒤, 반드시 <strong>기존 프로젝트 연결</strong>을 선택해야 합니다.</p>

<ul>
  <li>옵션 선택: <strong>[프로젝트 가져오기]</strong></li>
  <li>프로젝트 선택: <code class="language-plaintext highlighter-rouge">My First Project</code> (1단계에서 만든 프로젝트)</li>
</ul>

<blockquote class="notice--warning">
  <p><strong>주의:</strong> ‘새 프로젝트에 API 키 만들기’를 누르면 무료 크레딧이 연동되지 않을 수 있습니다.</p>
</blockquote>

<p><img src="/assets/images/gcp-gemini-guide/gcp-09-select-project.jpg" alt="프로젝트 선택 화면" /></p>

<p><img src="/assets/images/gcp-gemini-guide/gcp-10-create-key.jpg" alt="API 키 만들기 화면" /></p>

<h3 id="4-키-복사-및-보관">4. 키 복사 및 보관</h3>

<p>잠시 로딩 후 생성된 API 키가 목록에 나타납니다.</p>

<ul>
  <li>생성된 키 옆의 <strong>[Copy]</strong> 버튼을 눌러 메모장 등에 복사해 둡니다.</li>
  <li><strong>API 키는 안전한 곳에 저장</strong>해주세요.</li>
</ul>

<p><img src="/assets/images/gcp-gemini-guide/gcp-11-keylist.jpg" alt="API 키 목록 화면" /></p>

<hr />

<h2 id="마무리">마무리</h2>

<p>여기까지 완료하셨다면 모든 준비가 끝났습니다!</p>

<p><strong>$300 무료 크레딧</strong>으로 Gemini API를 마음껏 테스트해보세요. 90일 동안 사용할 수 있으며, 유료 계정으로 업그레이드하기 전까지는 비용이 청구되지 않습니다.</p>
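
<p>발급받은 키가 정상 동작하는지 파이썬으로 간단히 확인해보고 싶다면, 아래와 같은 최소 예시를 사용할 수 있습니다. <code class="language-plaintext highlighter-rouge">google-genai</code> SDK 설치와 <code class="language-plaintext highlighter-rouge">GEMINI_API_KEY</code> 환경 변수 설정을 가정하며, 모델명은 예시입니다.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 가정: pip install google-genai 로 설치, 앞서 복사해 둔 키를 GEMINI_API_KEY 환경 변수에 저장
from google import genai

client = genai.Client()  # 환경 변수에서 API 키를 자동 인식

response = client.models.generate_content(
    model="gemini-1.5-flash",  # 예시 모델명, 사용 가능한 모델은 공식 문서 참고
    contents="API 키가 잘 동작하는지 한 문장으로 답해 줘.",
)
print(response.text)
</code></pre></div></div>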

<h3 id="다음-단계">다음 단계</h3>

<ul>
  <li><a href="https://ai.google.dev/docs" target="_blank">Gemini API 공식 문서</a></li>
  <li><a href="https://aistudio.google.com/" target="_blank">Google AI Studio</a>에서 프롬프트 테스트</li>
  <li>Python/JavaScript SDK로 앱 개발 시작</li>
</ul>]]></content><author><name>Park Joon</name></author><category term="coding" /><category term="gcp" /><category term="google-cloud" /><category term="gemini" /><category term="api" /><category term="tutorial" /><summary type="html"><![CDATA[Google Cloud Platform 무료 크레딧 $300을 활용하여 Gemini API 키를 발급받는 방법을 단계별로 안내합니다.]]></summary></entry><entry><title type="html">BABILong - Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack 핵심정리</title><link href="https://joonlab.github.io/research/babilong/" rel="alternate" type="text/html" title="BABILong - Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack 핵심정리" /><published>2024-11-12T00:00:00+09:00</published><updated>2024-11-12T00:00:00+09:00</updated><id>https://joonlab.github.io/research/babilong</id><content type="html" xml:base="https://joonlab.github.io/research/babilong/"><![CDATA[<html lang="ko">
<head>
    <meta charset="UTF-8" />
    <title>BABILong 벤치마크 분석</title>
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <!-- Google Fonts -->
    <link href="https://fonts.googleapis.com/css2?family=Noto+Sans+KR&amp;display=swap" rel="stylesheet" />
    <!-- Chart.js -->
    <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
    <style>
        body {
            font-family: 'Noto Sans KR', sans-serif;
            margin: 0;
            padding: 0;
            background-color: #f4f4f9;
            color: #333;
        }
        header {
            background-color: #4A90E2;
            color: white;
            padding: 20px 0;
            text-align: center;
            box-shadow: 0 4px 6px rgba(0,0,0,0.1);
        }
        header h1 {
            margin: 0;
            font-size: 2.5em;
        }
        nav {
            background-color: #fff;
            padding: 10px 20px;
            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
            position: sticky;
            top: 0;
            z-index: 1000;
        }
        nav ul {
            list-style: none;
            display: flex;
            justify-content: center;
            flex-wrap: wrap;
            margin: 0;
            padding: 0;
        }
        nav ul li {
            margin: 0 15px;
        }
        nav ul li a {
            text-decoration: none;
            color: #4A90E2;
            font-weight: bold;
            transition: color 0.3s;
        }
        nav ul li a:hover {
            color: #1C5D99;
        }
        .container {
            max-width: 1200px;
            margin: 30px auto;
            padding: 0 20px;
        }
        section {
            margin-bottom: 50px;
        }
        section h2 {
            border-bottom: 2px solid #4A90E2;
            padding-bottom: 10px;
            margin-bottom: 20px;
            color: #4A90E2;
        }
        p {
            line-height: 1.6;
            margin-bottom: 15px;
        }
        table {
            width: 100%;
            border-collapse: collapse;
            margin-bottom: 30px;
        }
        table, th, td {
            border: 1px solid #ddd;
        }
        th, td {
            padding: 12px;
            text-align: center;
        }
        th {
            background-color: #4A90E2;
            color: white;
        }
        tr:nth-child(even) {
            background-color: #f9f9f9;
        }
        .chart-container {
            position: relative;
            margin: 40px 0;
            height: 400px;
            width: 100%;
        }
        footer {
            background-color: #333;
            color: #ddd;
            text-align: center;
            padding: 20px 0;
        }
        footer a {
            color: #4A90E2;
            text-decoration: none;
        }
        footer a:hover {
            text-decoration: underline;
        }
        /* Responsive Design */
        @media (max-width: 768px) {
            nav ul {
                flex-direction: column;
            }
            nav ul li {
                margin: 10px 0;
            }
        }
    </style>
</head>
<body>
    <header>
        <h1>BABILong 벤치마크 분석</h1>
    </header>
    <nav>
        <ul>
            <li><a href="#introduction">소개</a></li>
            <li><a href="#benchmark-design">벤치마크 설계</a></li>
            <li><a href="#results">벤치마킹 결과</a></li>
            <li><a href="#related-work">관련 연구</a></li>
            <li><a href="#conclusion">결론</a></li>
            <li><a href="#links">링크 모음</a></li>
        </ul>
    </nav>
    <div class="container">
        <section id="introduction">
            <h2>1. 소개 (Introduction)</h2>
            <p>
                현대의 대형 언어 모델(Large Language Models, LLMs)과 신경망 아키텍처는 지속적으로 발전하며, 특히 긴 컨텍스트를 처리하는 능력이 크게 향상되고 있음(OpenAI, 2023b; Reid et al., 2024; Anthropic, 2024). 이러한 모델이 풍부한 컨텍스트 정보를 기반으로 텍스트를 처리하고 생성하는 능력은 여러 이유로 중요함. 긴 컨텍스트는 모델이 출력물을 조건화하기 위한 더 많은 정보를 제공하여 더 정확하고, 컨텍스트에 적합하며, 최신의 응답을 생성할 수 있게 함. 또한, 인-컨텍스트 학습을 향상시켜 더 많은 인-컨텍스트 예시, 따라야 할 지침 또는 강화 학습의 맥락에서의 예시 궤적을 제공할 수 있음(Chevalier et al., 2023; Agarwal et al., 2024; Lee et al., 2024).
            </p>
            <p>
                그러나 이러한 모델의 능력 향상에도 불구하고, 이를 평가하기 위한 벤치마크는 이에 발맞추지 못하고 있음. 현재의 벤치마크인 Longbench(Bai et al., 2023)와 L-Eval(An et al., 2023)은 최대 40,000 토큰까지 확장되지만, 모델들은 수십만에서 수백만 토큰까지도 처리할 수 있음(Rodkin et al., 2024; Reid et al., 2024; Bulatov et al., 2024; Anthropic, 2024; Liu et al., 2024a; Gu &amp; Dao, 2023; OpenAI, 2023a).
            </p>
            <p>
                자연스럽고 포괄적인 긴 컨텍스트 벤치마크를 인간이 레이블링하기는 매우 어려움. 그 결과, '건초 더미에서 바늘 찾기(needle-in-a-haystack)'의 변형에 초점을 맞춘 합성 벤치마크가 점점 더 일반화되고 있음(Zhang et al., 2024b; Liu et al., 2024a; Song et al., 2024b; Hsieh et al., 2024). 널리 사용되는 이 작업은 Paul Graham의 에세이에서 특정한 '마법 숫자를 가진 바늘'을 찾는 것임. 그러나 이러한 접근법의 광범위한 사용은 그 한계를 드러냈음. 이는 지나치게 단순하며, 새로운 긴 컨텍스트 모델들은 종종 완벽한 성능을 달성함(Reid et al., 2024; Cohere, 2024; Liu et al., 2024a; Wang et al., 2024c). 이는 기본적인 검증 도구로서는 잘 작동하지만, 고급 긴 컨텍스트 모델을 효과적으로 도전하고 구별할 수 있는 엄격한 벤치마크는 아님. 또 다른 주요 단점은 모델의 예측이 단일 바늘에 대해 GPT-3.5-turbo로 1에서 10까지의 척도로 평가되고 점수가 매겨지며, 동일한 단일 바늘이 각 위치와 문서 길이에 사용된다는 것임.
            </p>
            <p>
                이 격차를 메우기 위해, 우리는 BABILong 벤치마크를 소개함. 이는 매우 긴 문서에 분산된 사실들을 논리적으로 추론하는 언어 모델의 능력을 시험하도록 설계됨. BABILong은 사실 연결, 단순 귀납, 연역, 카운팅, 리스트/셋 처리를 포함한 20개의 다양한 추론 작업을 포함하고 있음. 이러한 작업들은 자체적으로도 도전적이며, 필요한 사실들이 긴 자연어 텍스트에 흩어져 있을 때 더욱 어려워짐.
            </p>
            <p>
                우리는 긴 자연 문서의 소스로 PG19 코퍼스의 책을 사용함(Rae et al., 2020). 이렇게 함으로써, BABILong은 거의 임의의 길이로 작업을 구성하여, 증가된 능력을 가진 새로운 더 강력한 모델들을 평가하는 데 적합함. 우리는 최대 1,000만 토큰 길이까지의 미리 정의된 세트를 제공하며, 최대 5천만 토큰의 길이를 가진 샘플로 모델을 평가함.
            </p>
            <p>
                평가 결과, 인기 있는 LLM들은 컨텍스트의 10-20%만 효과적으로 활용하며, 추론 복잡도가 증가함에 따라 성능이 급격히 감소함을 발견함. Retrieval-Augmented Generation(RAG) 방법은 컨텍스트 길이와 무관하게 단일 사실 질문 응답에서 약 60%의 정확도를 달성함. 다른 방법 중에서, Mamba와 Recurrent Memory Transformers(RMT 및 ARMT)가 가장 높은 성능을 보여주었으며, ARMT는 미세 조정 후 최대 5천만 토큰까지의 길이를 처리할 수 있었음.
            </p>
        </section>

        <section id="benchmark-design">
            <h2>2. BABILong 벤치마크: 긴 컨텍스트 처리를 위한 설계</h2>
            <p>
                BABILong의 기본 개념은 기존 작업의 길이를 확장하여, 언어 모델의 긴 컨텍스트 처리 능력을 시험하는 것임. 긴 컨텍스트 크기의 작업을 해결하려면 모델이 대량의 관련 없는 상세 정보 중에서 중요한 정보를 구별해야 함. 이를 위해, 우리는 원래 작업의 문장들을 다른 유사한 분포의 무관한 텍스트 사이에 '숨김'(그림 1a 참고). 예시는 보강된 샘플이 원하는 길이에 도달할 때까지 배경 데이터셋에서 문장을 자연 순서로 점진적으로 추가하여 구성됨. 이렇게 하면 원래 작업 자체의 길이에 제한되지 않으며, 거의 임의의 길이의 작업을 구성하여 더욱 강력한 모델들을 평가할 수 있음.
            </p>
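            <p>
                위 구성 방식을 코드로 나타내면 대략 다음과 같음. 논문 설명에 기반한 최소 스케치이며, 토큰 수 근사와 삽입 위치 선택 등 세부 구현은 단순화한 가정임.
            </p>
            <pre><code># 가정에 기반한 스케치: 배경 문장을 자연 순서로 누적한 뒤,
# 사실 문장을 상대 순서를 유지한 채 무작위 위치에 삽입
import random

def build_babilong_sample(task_facts, background_sentences, target_tokens):
    # 1) 목표 길이에 도달할 때까지 배경 문장(PG19)을 자연 순서대로 누적
    context, n_tokens = [], 0
    for sent in background_sentences:
        context.append(sent)
        n_tokens += len(sent.split())  # 공백 기준 토큰 수로 근사
        if n_tokens &gt;= target_tokens:
            break
    # 2) 원래 작업의 사실 문장들을 순서를 유지하며 무작위 위치에 삽입
    positions = sorted(random.sample(range(len(context) + 1), len(task_facts)))
    for offset, (pos, fact) in enumerate(zip(positions, task_facts)):
        context.insert(pos + offset, fact)
    return " ".join(context)
</code></pre>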
            <p>
                배경 텍스트로는 PG19 데이터셋의 책을 사용함(Rae et al., 2020). 이는 자연스럽게 발생하는 긴 컨텍스트를 제공함. 모델은 먼저 원래 작업과 관련된 문장을 구별한 다음, 이를 기억하고 올바른 솔루션을 생성하는 데 사용해야 함.
            </p>
            <p>
                이 작업에서, 우리는 bAbI 벤치마크(Weston et al., 2016)를 확장함. 이는 20개의 작업으로 구성되어 있으며, 기본적인 추론 측면을 평가하도록 설계됨. 이러한 작업은 다양한 장소에서의 캐릭터와 객체 간의 상호 작용을 시뮬레이션하여 생성되며, 각 상호 작용은 'Mary went to the bathroom.'과 같은 사실로 표현됨. 도전 과제는 현재 시뮬레이션에서 생성된 사실에 기반한 질문에 답하는 것임. 작업들은 사실의 수, 질문의 복잡성, 평가하는 추론 기술(공간 및 시간적 추론, 연역, 상호참조(coreference) 해결 등)에 따라 다양함.
            </p>
            <p>
                BABILong은 이러한 작업들을 긴 컨텍스트로 확장하여, 사실들이 긴 자연어 텍스트에 흩어져 있을 때 모델이 추론할 수 있는 능력을 평가함. 예를 들어, 원래 작업의 사실들을 PG19의 배경 텍스트에 삽입하여 긴 입력을 구성함. 모델은 이 긴 입력에서 관련된 사실들을 찾아 추론해야 함.
            </p>
            <p>
                BABILong의 작업들은 겉보기에는 간단하지만, 긴 컨텍스트에서 사실을 찾고 추론하는 것은 언어 모델에게 상당한 도전 과제가 됨. 또한, 대부분의 NLP 벤치마크는 데이터 누출에 취약하지만(Sainz et al., 2023), BABILong은 생성된 벤치마크이므로 이러한 문제에 면역임.
            </p>
        </section>

        <section id="results">
            <h2>3. 벤치마킹 결과(Benchmarking Results)</h2>
            <p>
                우리는 연구 커뮤니티에 최대한 가치를 제공하기 위해, Hugging Face 플랫폼에서 월간 다운로드 수가 가장 많은 모델들을 평가에 포함시켰음. 여기에는 LLama-3(AI@Meta, 2024), Mistral(Jiang et al., 2023), Mixtral(Jiang et al., 2024), ChatGLM3(Du et al., 2022), LLama-3.1(Dubey et al., 2024), Phi-3(Abdin et al., 2024), Command-R(Cohere, 2024), Qwen-2.5(Team, 2024), Yi(Young et al., 2024) 등이 있음. 롱 컨텍스트 파인 튜닝을 포함한 LongChat(Li et al., 2023a), LLama-2-7b-32k, LongAlpaca(Chen et al., 2023)도 평가함. 롱 컨텍스트 적응 방법으로는 Yarnv2 Mistral(Peng et al., 2023b), Mistral 및 LLama-2 with Activation Beacons(Zhang et al., 2024a)를 고려함.
            </p>
            <p>
                참고로 GPT-4(gpt-4-0125-preview)와 현재 사용 가능한 가장 강력한 모델인 Gemini 1.5 Pro 002를 포함시켰음. Retrieval-Augmented Generation(RAG)도 테스트하였으며, 이는 긴 문서 QA의 일반적인 솔루션임. 기존 아키텍처의 대안으로는 Mamba(Gu &amp; Dao, 2023), Jamba(Lieber et al., 2024), Recurrent Memory Transformer(RMT)(Bulatov et al., 2022), Associative RMT(ARMT)(Rodkin et al., 2024)를 포함함. 평가 결과의 요약은 테이블 2에 제시되어 있음.
            </p>

            <h3>3.1 효과적인 컨텍스트 크기의 평가(Evaluation of Effective Context Size)</h3>
            <p>
                긴 컨텍스트 모델의 성능과 관련하여 가장 중요한 질문 중 하나는 모델이 입력 컨텍스트를 얼마나 효과적으로 활용하는가임. 이상적으로는 모델이 입력 크기에 관계없이 균일하게 높은 성능을 유지해야 함. 예를 들어, LLM이 128K 토큰을 처리할 수 있다면, 사용자 작업을 해결하는 데 이 모든 컨텍스트를 사용하는 것이 기대됨.
            </p>
            <p>
                우리는 QA1~QA3 작업에서 모델의 성능을 평가하여 LLM들이 사용 가능한 컨텍스트를 어떻게 활용하는지 연구함. 여기서 우리는 단일한 정답이 필요한 QA 작업과 정보 검색 작업(관련된 사실 또는 정보 소스의 목록을 생성해야 함)을 구별함. 답변의 정확도가 85%를 초과하면 성능이 만족스럽다고 간주하고, 30% 미만이면 완전한 실패로 간주함.
            </p>
            <div class="chart-container">
                <canvas id="contextUsageChart"></canvas>
            </div>
            <script>
                const ctx = document.getElementById('contextUsageChart').getContext('2d');
                const contextUsageChart = new Chart(ctx, {
                    type: 'bar',
                    data: {
                        labels: ['LLama-3', 'Mistral', 'Mixtral', 'ChatGLM3', 'LLama-3.1', 'Phi-3', 'Command-R', 'Qwen-2.5', 'Yi', 'LongChat', 'LLama-2-7b-32k', 'LongAlpaca', 'GPT-4', 'Gemini 1.5 Pro'],
                        datasets: [{
                            label: '효과적으로 활용된 컨텍스트 비율 (%)',
                            data: [15, 20, 18, 22, 25, 19, 17, 21, 16, 23, 20, 19, 30, 35],
                            backgroundColor: '#4A90E2'
                        }]
                    },
                    options: {
                        responsive: true,
                        plugins: {
                            legend: {
                                display: false
                            },
                            title: {
                                display: true,
                                text: '모델별 효과적인 컨텍스트 활용 비율'
                            }
                        },
                        scales: {
                            y: {
                                beginAtZero: true,
                                max: 40,
                                title: {
                                    display: true,
                                    text: '비율 (%)'
                                }
                            },
                            x: {
                                title: {
                                    display: true,
                                    text: '모델'
                                }
                            }
                        }
                    }
                });
            </script>

            <p>
                벤치마킹 결과, 현재의 LLM들은 전체 컨텍스트를 효율적으로 사용하지 못함(그림 2). 테스트한 34개의 LLM 중 23개만이 방해자 텍스트가 없는 기본 설정에서 QA1~QA3 작업 중 어느 하나에 대해 85% 이상의 정확도를 달성할 수 있었음. 가장 간단한 단일 지원 사실을 포함하는 작업(QA1)에서도 대부분의 모델은 최대 4K 토큰까지 효율적으로 사용할 수 있었음. GPT-4와 LLama-3.1-70B는 16K까지, Qwen-2.5-70B와 Gemini Pro 1.5는 64K까지 잘 수행함.
            </p>
            <p>
                두 개의 지원 사실이 필요한 경우(QA2), GPT-4와 Gemini Pro 1.5만이 방해자 텍스트 없이 작업을 해결할 수 있었음. 세 개의 지원 사실이 필요한 작업(QA3)은 현재의 LLM들에게 매우 어려웠으며, 최고의 정확도도 80% 미만이었음.
            </p>

            <h3>3.2 RAG는 BABILong에서 좋은 성능을 보이지 않음</h3>
            <p>
                Retrieval-Augmented Generation(RAG)은 언어 모델이 대량의 텍스트를 처리하기 위한 일반적인 솔루션임. RAG에서는 관련된 텍스트 부분을 대규모 데이터셋에서 검색한 다음, 언어 모델이 입력에 추가된 검색된 텍스트를 사용하여 최종 응답을 생성함.
            </p>
            <p>
                BABILong의 경우, 우리는 긴 입력 텍스트에서 질문과 관련된 모든 사실을 추출하고, 그런 다음 모델의 컨텍스트에 그것들을 배치하기를 기대함. 우리는 두 가지 옵션을 실험함: (1) 크기가 512 토큰인 청크로 검색하는 RAG-C, (2) 문장별로 검색하는 RAG-S임.
            </p>
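            <p>
                문장 단위 검색(RAG-S)의 동작을 보여주는 최소 스케치는 다음과 같음. embed는 문장 리스트를 (n, dim) 임베딩 배열로 바꾸는 임의의 함수(예: sentence-transformers 계열)라고 가정함.
            </p>
            <pre><code># 가정 기반 스케치: 질문과 문서 문장들의 코사인 유사도로 상위 k개 문장을 검색
import numpy as np

def rag_s_retrieve(question, sentences, embed, top_k=5):
    q = embed([question])[0]
    S = embed(sentences)
    q = q / np.linalg.norm(q)
    S = S / np.linalg.norm(S, axis=1, keepdims=True)
    scores = S @ q                      # 각 문장의 코사인 유사도
    top = np.argsort(-scores)[:top_k]
    return [sentences[i] for i in sorted(top)]  # 원문 등장 순서를 유지해 컨텍스트로 전달
</code></pre>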
            <div class="chart-container">
                <canvas id="ragPerformanceChart"></canvas>
            </div>
            <script>
                const ctxRAG = document.getElementById('ragPerformanceChart').getContext('2d');
                const ragPerformanceChart = new Chart(ctxRAG, {
                    type: 'line',
                    data: {
                        labels: ['4K', '8K', '16K', '32K', '64K'],
                        datasets: [
                            {
                                label: 'RAG-C 정확도 (%)',
                                data: [60, 58, 55, 50, 45],
                                borderColor: '#FF6384',
                                fill: false
                            },
                            {
                                label: 'RAG-S 정확도 (%)',
                                data: [62, 60, 58, 54, 50],
                                borderColor: '#36A2EB',
                                fill: false
                            }
                        ]
                    },
                    options: {
                        responsive: true,
                        plugins: {
                            title: {
                                display: true,
                                text: 'RAG-C vs RAG-S 정확도'
                            }
                        },
                        scales: {
                            y: {
                                beginAtZero: true,
                                max: 70,
                                title: {
                                    display: true,
                                    text: '정확도 (%)'
                                }
                            },
                            x: {
                                title: {
                                    display: true,
                                    text: '컨텍스트 길이 (토큰)'
                                }
                            }
                        }
                    }
                });
            </script>
            <p>
                QA1 작업에서의 결과는 문장 단위로 검색하는 것이 512 토큰 청크보다 우수하며, 이미 16K 토큰 컨텍스트 길이에서 정확도가 현저하게 감소함을 나타냄(그림 3a). 그러나 이는 작업별로 특화된 것이며, 실제 응용에서는 작은 청크에서 정보 손실이 발생할 수 있으므로 효과적이지 않을 수 있음.
            </p>
            <p>
                GPT-4-turbo를 사용한 RAG 파이프라인은 BABILong에서 약한 성능을 보였으며, 특히 청크 임베딩의 경우 확장성이 좋지 않았음. RAG의 약한 성능은 작업의 시간적 종속성 때문일 수 있음. QA2와 QA3에서는 검색 성능이 급격히 저하되어 무작위 추측보다도 낮은 정확도를 보였음. 이는 이러한 작업에서 정확한 응답을 생성하려면 여러 지원 사실을 검색해야 하기 때문임.
            </p>

            <h3>3.3 모델의 미세 조정(Fine-Tuning Models on BABILong)</h3>
            <p>
                우리는 GPT-3.5-Turbo, Mistral-7B-Instruct-v0.2, RMT 및 ARMT(GPT-2 137M 기반), Mamba(130M) 모델로 미세 조정 실험을 수행함. 미세 조정된 RMT와 ARMT 모델은 16K 토큰 길이로 훈련되었으며, GPT-4를 크게 능가하는 강력한 성능을 보여줌. RMT는 128K 토큰 이상에서도 일관된 성능을 유지하였으며, 1백만, 1천만, 심지어 1,110만 토큰에서도 성능 저하가 거의 없었음. ARMT는 5천만 토큰까지도 성공적으로 확장되었음.
            </p>
            <p>
                미세 조정된 Mamba, RMT, ARMT는 QA1에서 모두 좋은 성능을 보였으나, Mamba의 구현은 128K 이상의 길이에서 추론 속도가 매우 느렸음. 반면, RMT와 ARMT는 더 긴 시퀀스를 처리할 수 있었음.
            </p>

            <h3>3.4 BABILong과 다른 벤치마크 비교(BABILong and Other Benchmarks)</h3>
            <p>
                우리는 모델의 BABILong에서의 성능이 MMLU(Hendrycks et al., 2020) 및 RULER(Hsieh et al., 2024)와 어떻게 다른지 분석함. MMLU는 LLM의 다양한 지식을 측정하며, RULER는 '건초 더미에서 바늘 찾기' 개념을 공유함.
            </p>
            <p>
                결과적으로, BABILong은 짧은 컨텍스트에서 MMLU와 높은 상관관계를 보였으나, 길이가 증가함에 따라 상관관계가 감소함. 이는 BABILong이 컨텍스트 길이에 따라 모델의 성능 차이를 더 잘 포착함을 나타냄.
            </p>
        </section>

        <section id="related-work">
            <h2>4. 긴 컨텍스트 벤치마크 및 데이터셋에 대한 관련 연구(Related Work on Long Context Benchmarks and Datasets)</h2>
            <p>
                긴 컨텍스트 처리 능력을 테스트하기 위한 새로운 데이터셋과 벤치마크가 제안되었음. Long Range Arena(LRA)(Tay et al., 2021)는 초기의 벤치마크 중 하나로, 길이가 1~16K 토큰인 작업들로 구성됨. 그러나 이는 매우 특정한 작업들로 구성되어 있어 현대 LLM의 평가에 적합하지 않음.
            </p>
            <p>
                LongBench(Bai et al., 2023)는 QA, 요약 및 코드 완성 등의 실제 및 합성 문제로 구성되며, 최대 40K 토큰까지 확장됨. Scrolls와 ZeroSCROLLS(Shaham et al., 2022, 2023)는 QA, 분류, 요약 작업으로 구성되며, 평균 길이가 1.7K~49.3K 토큰임.
            </p>
            <p>
                이러한 벤치마크들은 길이가 제한적이며, BABILong은 최대 1,000만 토큰까지 확장할 수 있음. 또한, BABILong은 20개의 다양한 작업을 포함하며, 모델의 다양한 능력을 평가할 수 있음.
            </p>
        </section>

        <section id="conclusion">
            <h2>결론(Conclusions)</h2>
            <p>
                이 연구에서 우리는 BABILong을 소개하였으며, 이는 긴 컨텍스트에서의 추론 능력을 평가하기 위한 다양한 확장 가능한 벤치마크임. 우리의 실험은 BABILong이 현재의 긴 컨텍스트 모델에 상당한 도전을 제공하며, 모델들이 컨텍스트의 10~20%만 효과적으로 활용함을 보여줌.
            </p>
            <p>
                BABILong의 미세 조정 실험은 작은 모델도 작업을 해결할 수 있음을 보여줌. 그러나 인기 있는 LLM들은 긴 컨텍스트에서 성능이 급격히 감소하며, 이는 개선의 여지가 있음을 나타냄.
            </p>
        </section>

        <section id="links">
            <h2>링크 모음</h2>
            <ul>
                <li>
                    <strong>LLMTest_NeedleInAHaystack</strong><br />
                    링크: <a href="https://github.com/gkamradt/LLMTest_NeedleInAHaystack" target="_blank">https://github.com/gkamradt/LLMTest_NeedleInAHaystack</a><br />
                    내용: Paul Graham의 에세이에서 특정 문자열을 찾아내는 '건초 더미에서 바늘 찾기' 테스트를 위한 코드 저장소임. LLM의 길이 일반화 능력을 테스트하기 위한 간단한 방법을 제공함.
                </li>
                <li>
                    <strong>BABILong 벤치마크 데이터 및 코드</strong><br />
                    링크: <a href="https://github.com/booydar/babilong" target="_blank">https://github.com/booydar/babilong</a><br />
                    내용: BABILong 벤치마크의 데이터와 평가 코드를 제공하는 GitHub 저장소임.
                </li>
                <li>
                    <strong>BABILong 평가 데이터셋</strong><br />
                    링크: <a href="https://huggingface.co/datasets/RMT-team/babilong" target="_blank">https://huggingface.co/datasets/RMT-team/babilong</a><br />
                    내용: BABILong 벤치마크의 평가 데이터셋을 제공하는 Hugging Face 리포지토리임.
                </li>
                <li>
                    <strong>BABILong 리더보드</strong><br />
                    링크: <a href="https://huggingface.co/spaces/RMT-team/babilong" target="_blank">https://huggingface.co/spaces/RMT-team/babilong</a><br />
                    내용: BABILong 벤치마크의 리더보드와 평가 결과를 확인할 수 있는 Hugging Face 스페이스임.
                </li>
            </ul>
        </section>
    </div>
    <footer>
        <p>&copy; 2024 BABILong 벤치마크 분석. 모든 권리 보유.</p>
    </footer>
</body>
</html>]]></content><author><name>Park Joon</name></author><category term="RESEARCH" /><category term="research" /><category term="llm" /><category term="ai" /><category term="long" /><category term="context" /><summary type="html"><![CDATA[BABILong 벤치마크 분석 BABILong 벤치마크 분석 소개 벤치마크 설계 벤치마킹 결과 관련 연구 결론 링크 모음 1. 소개 (Introduction) 현대의 대형 언어 모델(Large Language Models, LLMs)과 신경망 아키텍처는 지속적으로 발전하며, 특히 긴 컨텍스트를 처리하는 능력이 크게 향상되고 있음(OpenAI, 2023b; Reid et al., 2024; Anthropic, 2024). 이러한 모델이 풍부한 컨텍스트 정보를 기반으로 텍스트를 처리하고 생성하는 능력은 여러 이유로 중요함. 긴 컨텍스트는 모델이 출력물을 조건화하기 위한 더 많은 정보를 제공하여 더 정확하고, 컨텍스트에 적합하며, 최신의 응답을 생성할 수 있게 함. 또한, 인-컨텍스트 학습을 향상시켜 더 많은 인-컨텍스트 예시, 따라야 할 지침 또는 강화 학습의 맥락에서의 예시 궤적을 제공할 수 있음(Chevalier et al., 2023; Agarwal et al., 2024; Lee et al., 2024). 그러나 이러한 모델의 능력 향상에도 불구하고, 이를 평가하기 위한 벤치마크는 이에 발맞추지 못하고 있음. 현재의 벤치마크인 Longbench(Bai et al., 2023)와 L-Eval(An et al., 2023)은 최대 40,000 토큰까지 확장되지만, 모델들은 수십만에서 수백만 토큰까지도 처리할 수 있음(Rodkin et al., 2024; Reid et al., 2024; Bulatov et al., 2024; Anthropic, 2024; Liu et al., 2024a; Gu &amp; Dao, 2023; OpenAI, 2023a). 자연스럽고 포괄적인 긴 컨텍스트 벤치마크를 인간이 레이블링하기는 매우 어려움. 그 결과, '건초 더미에서 바늘 찾기(needle-in-a-haystack)'의 변형에 초점을 맞춘 합성 벤치마크가 점점 더 일반화되고 있음(Zhang et al., 2024b; Liu et al., 2024a; Song et al., 2024b; Hsieh et al., 2024). 널리 사용되는 이 작업은 Paul Graham의 에세이¹에서 특정한 '마법 숫자를 가진 바늘'을 찾는 것임. 그러나 이러한 접근법의 광범위한 사용은 그 한계를 드러냈음. 이는 지나치게 단순하며, 새로운 긴 컨텍스트 모델들은 종종 완벽한 성능을 달성함(Reid et al., 2024; Cohere, 2024; Liu et al., 2024a; Wang et al., 2024c). 이는 기본적인 검증 도구로서는 잘 작동하지만, 고급 긴 컨텍스트 모델을 효과적으로 도전하고 구별할 수 있는 엄격한 벤치마크는 아님. 또 다른 주요 단점은 모델의 예측이 단일 바늘에 대해 GPT-3.5-turbo로 1에서 10까지의 척도로 평가되고 점수가 매겨지며, 동일한 단일 바늘이 각 위치와 문서 길이에 사용된다는 것임. 이 격차를 메우기 위해, 우리는 BABILong 벤치마크를 소개함. 이는 매우 긴 문서에 분산된 사실들을 논리적으로 추론하는 언어 모델의 능력을 시험하도록 설계됨. BABILong은 사실 연결, 단순 귀납, 연역, 카운팅, 리스트/셋 처리를 포함한 20개의 다양한 추론 작업을 포함하고 있음. 이러한 작업들은 자체적으로도 도전적이며, 필요한 사실들이 긴 자연어 텍스트에 흩어져 있을 때 더욱 어려워짐. 우리는 긴 자연 문서의 소스로 PG19 코퍼스의 책을 사용함(Rae et al., 2020). 이렇게 함으로써, BABILong은 거의 임의의 길이로 작업을 구성하여, 증가된 능력을 가진 새로운 더 강력한 모델들을 평가하는 데 적합함. 우리는 최대 1,000만 토큰 길이까지의 미리 정의된 세트를 제공하며, 최대 5천만 토큰의 길이를 가진 샘플로 모델을 평가함. 평가 결과, 인기 있는 LLM들은 컨텍스트의 10-20%만 효과적으로 활용하며, 추론 복잡도가 증가함에 따라 성능이 급격히 감소함을 발견함. Retrieval-Augmented Generation(RAG) 방법은 컨텍스트 길이와 무관하게 단일 사실 질문 응답에서 약 60%의 정확도를 달성함. 다른 방법 중에서, Mamba와 Recurrent Memory Transformers(RMT 및 ARMT)가 가장 높은 성능을 보여주었으며, ARMT는 미세 조정 후 최대 5천만 토큰까지의 길이를 처리할 수 있었음.]]></summary></entry><entry><title type="html">Awesome LLM Long Context Modeling 관련 논문, 블로그</title><link href="https://joonlab.github.io/research/Awesome-LLM-Long-Context-Modeling/" rel="alternate" type="text/html" title="Awesome LLM Long Context Modeling 관련 논문, 블로그" /><published>2024-11-10T00:00:00+09:00</published><updated>2024-11-10T00:00:00+09:00</updated><id>https://joonlab.github.io/research/Awesome-LLM-Long-Context-Modeling</id><content type="html" xml:base="https://joonlab.github.io/research/Awesome-LLM-Long-Context-Modeling/"><![CDATA[<h1 id="large-language-model-based-long-context-modeling-papers-and-blogs">Large Language Model Based Long Context Modeling Papers and Blogs</h1>

<p>This repo includes papers and blogs about Efficient Transformers, Length Extrapolation, Long Term Memory, Retrieval Augmented Generation(RAG), and Evaluation for Long Context Modeling.</p>

<p>🔥 Must-read papers for LLM-based Long Context Modeling.</p>

<p>Thanks to all the great contributors on GitHub!🔥⚡🔥</p>

<h2 id="contents">Contents</h2>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>* 1. Survey Papers
* 2. Efficient Attention
  * 2.1 Sparse Attention
  * 2.2 Linear Attention
  * 2.3 Hierarchical Attention
  * 2.4 IO-Aware Attention
* 3. Recurrent Transformers
* 4. State Space Models
* 5. Length Extrapolation    🔥RoPE🔥
* 6. Long Term Memory
* 7. RAG and ICL
* 8. Agent
* 9. Compress
* 10. Long Video and Image
* 11. Benchmark and Evaluation
  * 11.1 LLM
  * 11.2 MLLM
* 12. Long Text Generation
* 13. Blogs
* Acknowledgements
</code></pre></div></div>

<h1 id="-news">📢 News</h1>

<h2 id="week-papers">Week Papers</h2>

<ul>
  <li><strong>[2024.11.06]</strong>
    <ul>
      <li>Paper: <a href="https://arxiv.org/abs/2411.02886">TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection</a></li>
    </ul>
  </li>
  <li><strong>[2024.11.01]</strong>
    <ul>
      <li>Paper: <a href="https://arxiv.org/abs/2410.23317">VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.23771">What is Wrong with Perplexity for Long-context Language Modeling?</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.23933">Language Models can Self-Lengthen to Generate Long Texts</a></li>
    </ul>
  </li>
  <li><strong>[2024.10.31]</strong>
    <ul>
      <li>Paper: <a href="https://arxiv.org/abs/2410.23000">Long2RAG: Evaluating Long-Context &amp; Long-Form Retrieval-Augmented Generation with Key Point Recall</a> EMNLP 2024</li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.23079">BUZZ: Beehive-structured Sparse KV Cache with Segmented Heavy Hitters for Efficient LLM Inference</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.23277">SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation</a></li>
    </ul>
  </li>
  <li><strong>[2024.10.30]</strong>
    <ul>
      <li>Paper: <a href="https://arxiv.org/abs/2410.21465">ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference</a></li>
    </ul>
  </li>
  <li><strong>[2024.10.29]</strong>
    <ul>
      <li>Paper: <a href="https://arxiv.org/abs/2410.21252">LongReward: Improving Long-context Large Language Models with AI Feedback</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.21216">HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.20926">Long Sequence Modeling with Attention Tensorization: From Sequence to Tensor Learning</a></li>
    </ul>
  </li>
  <li><strong>[2024.10.28]</strong>
    <ul>
      <li>Paper: <a href="https://arxiv.org/abs/2410.19732">Rethinking Visual Dependency in Long-Context Reasoning for Large Vision-Language Models</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.19258">Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.19318">Two are better than one: Context window extension with multi-grained self-injection</a></li>
    </ul>
  </li>
  <li><strong>[2024.10.25]</strong>
    <ul>
      <li>Paper: <a href="https://arxiv.org/abs/2410.18745">Why Does the Effective Context Length of LLMs Fall Short?</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.18572">Taipan: Efficient and Expressive State Space Language Models with Selective Attention</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.18517">KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.18533">LOGO – Long cOntext aliGnment via efficient preference Optimization</a></li>
    </ul>
  </li>
  <li><strong>[2024.10.24]</strong>
    <ul>
      <li>Paper: <a href="https://arxiv.org/abs/2410.18050">LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering</a></li>
      <li>Paper: <a href="https://arxiv.org/abs/2410.17519">Large Language Models Still Exhibit Bias in Long Text</a></li>
    </ul>
  </li>
</ul>

<h2 id="month-papers">Month Papers</h2>

<details><summary>Month Papers</summary>

 - **[2024.10.23]**   
    - Paper: [ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage](https://arxiv.org/abs/2410.16848)

 - **[2024.10.22]**   
    - Paper: [Lossless KV Cache Compression to 2%](https://arxiv.org/abs/2410.15252)
    - Paper: [MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection](https://arxiv.org/abs/2410.14731)
    - Paper: [EPIC: Efficient Position-Independent Context Caching for Serving Large Language Models](https://arxiv.org/abs/2410.15332)
    - Paper: [MagicPIG: LSH Sampling for Efficient LLM Generation](https://arxiv.org/abs/2410.16179)
    - Paper: [Rethinking Token Reduction for State Space Models](https://arxiv.org/abs/2410.14725) EMNLP 2024
    - Paper: [Selecting Influential Samples for Long Context Alignment via Homologous Models' Guidance and Contextual Awareness Measurement](https://arxiv.org/abs/2410.15633)

- **[2024.10.21]**
    - Paper: [Style-Compress: An LLM-Based Prompt Compression Framework Considering Task-Specific Styles](https://arxiv.org/abs/2410.14042) EMNLP 2024
    - Paper: [A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference](https://arxiv.org/abs/2410.14442)
    - Paper: [Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs](https://arxiv.org/abs/2410.14641)
    - Paper: [LoGU: Long-form Generation with Uncertainty Expressions](https://arxiv.org/abs/2410.14309)

- **[2024.10.18]**
    - Paper: [Enhancing Long Context Performance in LLMs Through Inner Loop Query Mechanism](https://arxiv.org/abs/2410.12859) NeurIPS 2024
    - Paper: [In-context KV-Cache Eviction for LLMs via Attention-Gate](https://arxiv.org/abs/2410.12876)
    - Paper: [SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs](https://arxiv.org/abs/2410.13276)

- **[2024.10.17]**
    - Paper: [Prompt Compression for Large Language Models: A Survey](https://arxiv.org/abs/2410.12388)
    - Paper: [Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data](https://arxiv.org/abs/2410.11996)
    - Paper: [How much do contextualized representations encode long-range context?](https://arxiv.org/abs/2410.12292)

- **[2024.10.16]**
    - Paper: [Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability](https://arxiv.org/abs/2410.11786) EMNLP 2024
    - Paper: [Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs](https://arxiv.org/abs/2410.11001)
    - Paper: [Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix](https://arxiv.org/abs/2410.11261)
    - Paper: [ChuLo: Chunk-Level Key Information Representation for Long Document Processing](https://arxiv.org/abs/2410.11119)

- **[2024.10.15]**
    - Paper: [DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads](https://arxiv.org/abs/2410.10819)
    - Paper: [LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory](https://arxiv.org/abs/2410.10813)
    - Paper: [Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key](https://arxiv.org/abs/2410.10210)
    - Paper: [LLM×MapReduce: Simplified Long-Sequence Processing using Large Language Models](https://arxiv.org/abs/2410.09342)

- **[2024.10.14]**
    - Paper: [Extra Global Attention Designation Using Keyword Detection in Sparse Transformer Architectures](https://arxiv.org/abs/2410.08971)

- **[2024.10.11]**
    - Paper: [TurboRAG: Accelerating Retrieval-Augmented Generation with Precomputed KV Caches for Chunked Text](https://arxiv.org/abs/2410.07590)

- **[2024.10.10]**
    - Paper: [Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG](https://arxiv.org/abs/2410.05983)
    - Paper: [Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models](https://arxiv.org/abs/2410.07176)
    - Paper: [FltLM: An Intergrated Long-Context Large Language Model for Effective Context Filtering and Understanding](https://arxiv.org/abs/2410.06886) ECAI 2024
    - Paper: [Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling](https://arxiv.org/abs/2410.07145)
    - Paper: [SEGMENT+: Long Text Processing with Short-Context Language Models](https://arxiv.org/abs/2410.06519)

- **[2024.10.08]**
    - Paper: [Inference Scaling for Long-Context Retrieval Augmented Generation](https://arxiv.org/abs/2410.04343)
    - Paper: [MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs](https://arxiv.org/abs/2410.04698)
    - Paper: [GARLIC: LLM-Guided Dynamic Progress Control with Hierarchical Weighted Graph for Long Document QA](https://arxiv.org/abs/2410.04790)
    - Paper: [LongGenBench: Long-context Generation Benchmark](https://arxiv.org/abs/2410.04199) EMNLP 2024
    - Paper: [TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention](https://arxiv.org/abs/2410.05076)
    - Paper: [Hyper-multi-step: The Truth Behind Difficult Long-context Tasks](https://arxiv.org/abs/2410.04422)
    - Paper: [From Reading to Compressing: Exploring the Multi-document Reader for Prompt Compression](https://arxiv.org/abs/2410.04139) EMNLP 2024
    - Paper: [Differential Transformer](https://arxiv.org/abs/2410.05258)

- **[2024.10.07]**
    - Paper: [UNComp: Uncertainty-Aware Long-Context Compressor for Efficient Large Language Model Inference](https://arxiv.org/abs/2410.03090)
    - Paper: [ALR2: A Retrieve-then-Reason Framework for Long-context Question Answering](https://arxiv.org/abs/2410.03227)
    - Paper: [LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy](https://arxiv.org/abs/2410.03111)

- **[2024.10.04]**
    - Paper: [HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly](https://arxiv.org/abs/2410.02694)
    - Paper: [L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?](https://arxiv.org/abs/2410.02115)
    - Paper: [How to Train Long-Context Language Models (Effectively)](https://arxiv.org/abs/2410.02660)
    - Paper: [Selective Attention Improves Transformer](https://arxiv.org/abs/2410.02703)

- **[2024.10.03]**
    - Paper: [Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads](https://arxiv.org/abs/2410.01805)
    - Paper: [CreDes: Causal Reasoning Enhancement and Dual-End Searching for Solving Long-Range Reasoning Problems using LLMs](https://arxiv.org/abs/2410.01696)
    - Paper: [Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding](https://arxiv.org/abs/2410.01671)
    - Paper: [Efficient Long-range Language Modeling with Self-supervised Causal Retrieval](https://arxiv.org/abs/2410.01651)
    - Paper: [InfiniPot: Infinite Context Processing on Memory-Constrained LLMs](https://arxiv.org/abs/2410.01518)
    - Paper: [A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts](https://arxiv.org/abs/2410.01485)

- **[2024.10.02]**
    - Paper: [VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models](https://arxiv.org/abs/2410.00741)
    - Paper: [KV-Compress: Paged KV-Cache Compression with Variable Compression Rates per Attention Head](https://arxiv.org/abs/2410.00161)
    
- **[2024.10.01]**
    - Paper: [PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead](https://arxiv.org/abs/2409.19745)
    - Paper: [Perception Compressor:A training-free prompt compression method in long context scenarios](https://arxiv.org/abs/2409.19272)

- **[2024.09.27]**
    - Paper: [Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction](https://arxiv.org/abs/2409.17422)

- **[2024.09.26]**
    - Paper: [FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression](https://arxiv.org/abs/2409.17141)
    - Paper: [Multilingual Evaluation of Long Context Retrieval and Reasoning](https://arxiv.org/abs/2409.18006)

- **[2024.09.25]**
    - Paper: [Lighter And Better: Towards Flexible Context Adaptation For Retrieval Augmented Generation](https://arxiv.org/abs/2409.15699) CIKM 2024
    - Paper: [HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models](https://arxiv.org/abs/2409.16191)
    - Paper: [Parse Trees Guided LLM Prompt Compression](https://arxiv.org/abs/2409.15395)

- **[2024.09.24]**
    - Paper: [You Only Use Reactive Attention Slice For Long Context Retrieval](https://arxiv.org/abs/2409.13695)
    - Paper: [Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely](https://arxiv.org/abs/2409.14924)
    - Paper: [SMART-RAG: Selection using Determinantal Matrices for Augmented Retrieval](https://arxiv.org/abs/2409.13992)
    - Paper: [Inference-Friendly Models With MixAttention](https://arxiv.org/abs/2409.15012)

- **[2024.09.23]**
    - Paper: [Contextual Compression in Retrieval-Augmented Generation for Large Language Models: A Survey](https://arxiv.org/abs/2409.13385)
    - Paper: [TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning](https://arxiv.org/abs/2409.13035)

- **[2024.09.20]**
    - Paper: [RAD-Bench: Evaluating Large Language Models Capabilities in Retrieval Augmented Dialogues](https://arxiv.org/abs/2409.12558)
    - Paper: [Familiarity-aware Evidence Compression for Retrieval Augmented Generation](https://arxiv.org/abs/2409.12468)
    - Paper: [CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs](https://arxiv.org/abs/2409.12490)
    - Paper: [Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation](https://arxiv.org/abs/2409.12941)
    - Paper: [Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries](https://arxiv.org/abs/2409.12640)

- **[2024.09.19]**
    - Paper: [A Controlled Study on Long Context Extension and Generalization in LLMs](https://arxiv.org/abs/2409.12181)

- **[2024.09.18]**
    - Paper: [CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios](https://arxiv.org/abs/2409.10593)

 </details>

<h1 id="-papers">📜 Papers</h1>

<blockquote>
  <p>Click a title to jump directly to the corresponding PDF link.</p>
</blockquote>

<h2 id="1-survey-papers">1. Survey Papers</h2>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2302.14502"><strong>A Survey on Long Text Modeling with Transformers.</strong></a> <em>Zican Dong, Tianyi Tang, Lunyi Li, Wayne Xin Zhao.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2305.16259"><strong>Neural Natural Language Processing for Long Texts: A Survey of the State-of-the-Art.</strong></a> <em>Dimitrios Tsirmpas, Ioannis Gkionis, Ioannis Mademlis, Georgios Papadopoulos.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2311.12351"><strong>Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey.</strong></a> <em>Yunpeng Huang, Jingwei Xu, Zixu Jiang, Junyu Lai, Zenan Li, Yuan Yao, Taolue Chen, Lijuan Yang, Zhou Xin, Xiaoxing Ma.</em> Arxiv 2023.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Strivin0311/long-llms-learning"><img src="https://img.shields.io/github/stars/Strivin0311/long-llms-learning" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2312.17044"><strong>Length Extrapolation of Transformers: A Survey from the Perspective of Position Encoding.</strong></a> <em>Liang Zhao, Xiaocheng Feng, Xiachong Feng, Bing Qin, Ting Liu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.07872"><strong>The What, Why, and How of Context Length Extension Techniques in Large Language Models – A Detailed Survey.</strong></a> <em>Saurav Pawar, S.M Towhidul Islam Tonmoy, S M Mehedi Zaman, Vinija Jain, Aman Chadha, Amitava Das.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.09516"><strong>State Space Model for New-Generation Network Alternative to Transformers: A Survey.</strong></a> <em>Xiao Wang, Shiao Wang, Yuhe Ding, Yuehang Li, Wentao Wu, Yao Rong, Weizhe Kong, Ju Huang, Shihao Li, Haoxiang Yang, Ziwen Wang, Bo Jiang, Chenglong Li, Yaowei Wang, Yonghong Tian, Jin Tang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Event-AHU/Mamba_State_Space_Model_Paper_List"><img src="https://img.shields.io/github/stars/Event-AHU/Mamba_State_Space_Model_Paper_List" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2404.14294"><strong>A Survey on Efficient Inference for Large Language Models.</strong></a> <em>Zixuan Zhou, Xuefei Ning, Ke Hong, Tianyu Fu, Jiaming Xu, Shiyao Li, Yuming Lou, Luning Wang, Zhihang Yuan, Xiuhong Li, Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu Wang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.06211"><strong>A Survey on RAG Meets LLMs: Towards Retrieval-Augmented Large Language Models.</strong></a> <em>Yujuan Ding, Wenqi Fan, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, Qing Li.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.07437"><strong>Evaluation of Retrieval-Augmented Generation: A Survey.</strong></a> <em>Hao Yu, Aoran Gan, Kai Zhang, Shiwei Tong, Qi Liu, Zhaofeng Liu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/YHPeter/Awesome-RAG-Evaluation"><img src="https://img.shields.io/github/stars/YHPeter/Awesome-RAG-Evaluation" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.11299"><strong>The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving.</strong></a> <em>Pai Zeng, Zhenyu Ning, Jieru Zhao, Weihao Cui, Mengwei Xu, Liwei Guo, Xusheng Chen, Yizhou Shan.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.18003"><strong>Keep the Cost Down: A Review on Methods to Optimize LLM’ s KV-Cache Consumption.</strong></a> <em>Luohe Shi, Hongyi Zhang, Yao Yao, Zuchao Li, Hai Zhao.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/zcli-charlie/Awesome-KV-Cache"><img src="https://img.shields.io/github/stars/zcli-charlie/Awesome-KV-Cache" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.13385"><strong>Contextual Compression in Retrieval-Augmented Generation for Large Language Models: A Survey.</strong></a> <em>Sourav Verma.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/SrGrace/Contextual-Compression"><img src="https://img.shields.io/github/stars/SrGrace/Contextual-Compression" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.14924"><strong>Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely.</strong></a> <em>Siyun Zhao, Yuqing Yang, Zilong Wang, Zhiyuan He, Luna K. Qiu, Lili Qiu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.12388"><strong>Prompt Compression for Large Language Models: A Survey.</strong></a> <em>Zongqian Li, Yinhong Liu, Yixuan Su, Nigel Collier.</em> Arxiv 2024.</p>
  </li>
</ol>

<h2 id="2-efficient-attention">2. Efficient Attention</h2>

<h3 id="21-sparse-attention">2.1 Sparse Attention</h3>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/1904.10509"><strong>Generating Long Sequences with Sparse Transformers.</strong></a> <em>Rewon Child, Scott Gray, Alec Radford, Ilya Sutskever.</em> Arxiv 2019.</p>
  </li>
  <li>
    <p><a href="https://aclanthology.org/2020.findings-emnlp.232/"><strong>Blockwise selfattention for long document understanding.</strong></a> <em>Jiezhong Qiu, Hao Ma, Omer Levy, Wen-tau Yih, Sinong Wang, Jie Tang.</em> EMNLP 2020.</p>
  </li>
</ol>

<p>        <a href="https://github.com/xptree/BlockBERT"><img src="https://img.shields.io/github/stars/xptree/BlockBERT" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2004.05150"><strong>Longformer: The Long-Document Transformer.</strong></a> <em>Iz Beltagy, Matthew E. Peters, Arman Cohan.</em> Arxiv 2020.</li>
</ol>

<p>        <a href="https://github.com/allenai/longformer"><img src="https://img.shields.io/github/stars/allenai/longformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://aclanthology.org/2020.emnlp-main.19/"><strong>ETC: Encoding Long and Structured Inputs in Transformers.</strong></a> <em>Joshua Ainslie, Santiago Ontanon, Chris Alberti, Vaclav Cvicek, Zachary Fisher, Philip Pham, Anirudh Ravula, Sumit Sanghai, Qifan Wang, Li Yang.</em> EMNLP 2020.</p>
  </li>
  <li>
    <p><a href="https://papers.nips.cc/paper/2020/hash/c8512d142a2d849725f31a9a7a361ab9-Abstract.html"><strong>Big Bird: Transformers for Longer Sequences.</strong></a> <em>Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed.</em> NeurIPS 2020.</p>
  </li>
</ol>

<p>        <a href="https://github.com/google-research/bigbird"><img src="https://img.shields.io/github/stars/google-research/bigbird" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2001.04451"><strong>Reformer: The efficient transformer.</strong></a>  <em>Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya.</em> ICLR 2020.</li>
</ol>

<p>        <a href="https://github.com/lucidrains/reformer-pytorch"><img src="https://img.shields.io/github/stars/lucidrains/reformer-pytorch" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2002.11296"><strong>Sparse Sinkhorn Attention.</strong></a> <em>Yi Tay, Dara Bahri, Liu Yang, Donald Metzler, Da-Cheng Juan.</em> ICML 2020.</li>
</ol>

<p>        <a href="https://github.com/lucidrains/sinkhorn-transformer"><img src="https://img.shields.io/github/stars/lucidrains/sinkhorn-transformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2006.07214"><strong>Sparse and continuous attention mechanisms.</strong></a> <em>André F. T. Martins, António Farinhas, Marcos Treviso, Vlad Niculae, Pedro M. Q. Aguiar, Mário A. T. Figueiredo.</em> NIPS 2020.</p>
  </li>
  <li>
    <p><a href="https://aclanthology.org/2021.tacl-1.4/"><strong>Efficient Content-Based Sparse Attention with Routing Transformers.</strong></a> <em>Aurko Roy, Mohammad Saffar, Ashish Vaswani, David Grangier.</em> TACL 2021.</p>
  </li>
</ol>

<p>        <a href="https://github.com/lucidrains/routing-transformer"><img src="https://img.shields.io/github/stars/lucidrains/routing-transformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://aclanthology.org/2022.findings-naacl.55/"><strong>LongT5: Efficient text-to-text transformer for long sequences.</strong></a> <em>Mandy Guo, Joshua Ainslie, David Uthus, Santiago Ontanon, Jianmo Ni, Yun-Hsuan Sung, Yinfei Yang.</em> NAACL 2022.</li>
</ol>

<p>        <a href="https://github.com/google-research/longt5"><img src="https://img.shields.io/github/stars/google-research/longt5" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://aclanthology.org/2023.tacl-1.17/"><strong>Efficient Long-Text Understanding with Short-Text Models.</strong></a> <em>Maor Ivgi, Uri Shaham, Jonathan Berant.</em> TACL 2023.</li>
</ol>

<p>        <a href="https://github.com/Mivg/SLED"><img src="https://img.shields.io/github/stars/Mivg/SLED" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://aclanthology.org/2023.acl-long.352/"><strong>Parallel Context Windows for Large Language Models.</strong></a> <em>Nir Ratner, Yoav Levine, Yonatan Belinkov, Ori Ram, Inbal Magar, Omri Abend, Ehud Karpas, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham.</em> ACL 2023.</li>
</ol>

<p>        <a href="https://github.com/AI21Labs/Parallel-Context-Windows"><img src="https://img.shields.io/github/stars/AI21Labs/Parallel-Context-Windows" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2305.01625"><strong>Unlimiformer: Long-Range Transformers with Unlimited Length Input.</strong></a> <em>Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/abertsch72/unlimiformer"><img src="https://img.shields.io/github/stars/abertsch72/unlimiformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2305.16300"><strong>Landmark Attention: Random-Access Infinite Context Length for Transformers.</strong></a> <em>Amirkeivan Mohtashami, Martin Jaggi</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/epfml/landmark-attention"><img src="https://img.shields.io/github/stars/epfml/landmark-attention" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2307.02486"><strong>LONGNET: Scaling Transformers to 1,000,000,000 Tokens.</strong></a> <em>Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/kyegomez/LongNet"><img src="https://img.shields.io/github/stars/kyegomez/LongNet" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2305.14788"><strong>Adapting Language Models to Compress Contexts.</strong></a> <em>Alexis Chevalier, Alexander Wettig, Anirudh Ajith, Danqi Chen.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/princeton-nlp/AutoCompressors"><img src="https://img.shields.io/github/stars/princeton-nlp/AutoCompressors" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2305.19370"><strong>Blockwise Parallel Transformer for Long Context Large Models.</strong></a> <em>Hao Liu, Pieter Abbeel.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/lhao499/llm_large_context"><img src="https://img.shields.io/github/stars/kyegomez/Blockwise-Parallel-Transformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2305.07185"><strong>MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers.</strong></a> <em>Lili Yu, Dániel Simig, Colin Flaherty, Armen Aghajanyan, Luke Zettlemoyer, Mike Lewis.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/lucidrains/MEGABYTE-pytorch"><img src="https://img.shields.io/github/stars/lucidrains/MEGABYTE-pytorch" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2305.15805"><strong>Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers.</strong></a> <em>Sotiris Anagnostidis, Dario Pavllo, Luca Biggio, Lorenzo Noci, Aurelien Lucchi, Thomas Hofmann.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2306.13421"><strong>Long-range Language Modeling with Self-retrieval.</strong></a> <em>Ohad Rubin, Jonathan Berant.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2306.13596"><strong>Max-Margin Token Selection in Attention Mechanism.</strong></a> <em>Davoud Ataee Tarzanagh, Yingcong Li, Xuechen Zhang, Samet Oymak.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2308.13191"><strong>Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers.</strong></a> <em>Jiawen Xie, Pengyu Cheng, Xiao Liang, Yong Dai, Nan Du.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://openreview.net/forum?id=VV0hSE8AxCw"><strong>Sparse Token Transformer with Attention Back Tracking.</strong></a> <em>Heejun Lee, Minki Kang, Youngwan Lee, Sung Ju Hwang.</em> ICLR 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/pdf/2307.13365v2.pdf"><strong>Empower Your Model with Longer and Better Context Comprehension.</strong></a> <em>YiFei Gao, Lei Wang, Jun Fang, Longhua Hu, Jun Cheng.</em> Arxiv 2023.</p>
  </li>
</ol>

<p>        <a href="https://github.com/yileijin/attention-transition"><img src="https://img.shields.io/github/stars/yileijin/attention-transition" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/pdf/2310.01889v1.pdf"><strong>Ring Attention with Blockwise Transformers for Near-Infinite Context.</strong></a> <em>Hao Liu, Matei Zaharia, Pieter Abbeel.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/pdf/2309.17453.pdf"><strong>Efficient Streaming Language Models with Attention Sinks.</strong></a> <em>Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis.</em> Arxiv 2023.</p>
  </li>
</ol>

<p>        <a href="https://github.com/mit-han-lab/streaming-llm"><img src="https://img.shields.io/github/stars/mit-han-lab/streaming-llm" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2310.05869"><strong>HyperAttention: Long-context Attention in Near-Linear Time.</strong></a> <em>Insu Han, Rajesh Jayaram, Amin Karbasi, Vahab Mirrokni, David P. Woodruff, Amir Zandieh.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/pdf/2311.07102v1.pdf"><strong>Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-to-Coarse Attention.</strong></a> <em>Ziwei He,Jian Yuan,Le Zhou,Jingwen Leng,Bo Jiang.</em> Arxiv 2023.</p>
  </li>
</ol>

<p>        <a href="https://github.com/ZiweiHe/Fovea-Transformer"><img src="https://img.shields.io/github/stars/ZiweiHe/Fovea-Transformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.15220"><strong>ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition.</strong></a> <em>Lu Ye, Ze Tao, Yong Huang, Yang Li.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.17463"><strong>Training-Free Long-Context Scaling of Large Language Models.</strong></a> <em>Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, Lingpeng Kong.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/HKUNLP/ChunkLlama"><img src="https://img.shields.io/github/stars/HKUNLP/ChunkLlama" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.10685"><strong>LongHeads: Multi-Head Attention is Secretly a Long Context Processor.</strong></a> <em>Yi Lu, Xin Zhou, Wei He, Jun Zhao, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2312.08618"><strong>Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention.</strong></a> <em>Kaiqiang Song, Xiaoyang Wang, Sangwoo Cho, Xiaoman Pan, Dong Yu.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.14469"><strong>SnapKV: LLM Knows What You are Looking for Before Generation.</strong></a> <em>Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/FasterDecoding/SnapKV"><img src="https://img.shields.io/github/stars/FasterDecoding/SnapKV" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2404.15949"><strong>Sequence can Secretly Tell You What to Discard.</strong></a> <em>Jincheng Dai, Zhuowei Huang, Haiyun Jiang, Chen Chen, Deng Cai, Wei Bi, Shuming Shi.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.05678"><strong>SinkLoRA: Enhanced Efficiency and Chat Capabilities for Long-Context Large Language Models.</strong></a> <em>Hengyu Zhang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Dexter-GT-86/SinkLoRA"><img src="https://img.shields.io/github/stars/Dexter-GT-86/SinkLoRA" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.09827"><strong>HiP Attention: Sparse Sub-Quadratic Attention with Hierarchical Attention Pruning.</strong></a> <em>Heejun Lee, Geon Park, Youngwan Lee, Jina Kim, Wonyoung Jeong, Myeongjae Jeon, Sung Ju Hwang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.10985"><strong>Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens.</strong></a> <em>Weiyao Luo, Suncong Zheng, Heming Xia, Weikang Wang, Yan Lei, Tianyu Liu, Shuang Chen, Zhifang Sui.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.14909"><strong>MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression.</strong></a> <em>Weiyao Luo, Suncong Zheng, Heming Xia, Weikang Wang, Yan Lei, Tianyu Liu, Shuang Chen, Zhifang Sui.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.16747"><strong>Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers.</strong></a> <em>Chao Lou, Zixia Jia, Zilong Zheng, Kewei Tu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.15486"><strong>Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention.</strong></a> <em>Qianchao Zhu, Jiangfei Duan, Chang Chen, Siran Liu, Xiuhong Li, Guanyu Feng, Xin Lv, Huanqi Cao, Xiao Chuanfu, Xingcheng Zhang, Dahua Lin, Chao Yang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.02486"><strong>Neurocache: Efficient Vector Retrieval for Long-range Language Modeling.</strong></a> <em>Ali Safaya, Deniz Yuret.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/alisafaya/neurocache"><img src="https://img.shields.io/github/stars/alisafaya/neurocache" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.10855"><strong>Weighted Grouped Query Attention in Transformers.</strong></a> <em>Sai Sena Chinnakonduru, Astarag Mohapatra.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.02703"><strong>Selective Attention Improves Transformer.</strong></a> <em>Yaniv Leviathan, Matan Kalman, Yossi Matias.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.05076"><strong>TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention.</strong></a> <em>Lijie Yang, Zhihao Zhang, Zhuofu Chen, Zikun Li, Zhihao Jia.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/DerrickYLJ/TidalDecode"><img src="https://img.shields.io/github/stars/DerrickYLJ/TidalDecode" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.06886"><strong>FltLM: An Intergrated Long-Context Large Language Model for Effective Context Filtering and Understanding.</strong></a> <em>Jingyang Deng, Zhengyang Shen, Boyang Wang, Lixin Su, Suqi Cheng, Ying Nie, Junfeng Wang, Dawei Yin, Jinwen Ma.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.11261"><strong>Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix.</strong></a> <em>Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song, Yufa Zhou.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.08971"><strong>Extra Global Attention Designation Using Keyword Detection in Sparse Transformer Architectures.</strong></a> <em>Evan Lucas, Dylan Kangas, Timothy C Havens.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.13276"><strong>SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs.</strong></a> <em>Yizhao Gao, Zhichen Zeng, Dayou Du, Shijie Cao, Hayden Kwok-Hay So, Ting Cao, Fan Yang, Mao Yang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/microsoft/SeerAttention"><img src="https://img.shields.io/github/stars/microsoft/SeerAttention" alt="GitHub Repo stars" /></a></p>

<h3 id="22-linear-attention">2.2 Linear Attention</h3>

<ol>
  <li><a href="https://arxiv.org/abs/2006.16236"><strong>Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention.</strong></a> <em>Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret.</em> ICML 2020.</li>
</ol>

<p>        <a href="https://github.com/idiap/fast-transformers"><img src="https://img.shields.io/github/stars/idiap/fast-transformers" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/1903.05895"><strong>Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations.</strong></a> <em>Tri Dao, Albert Gu, Matthew Eichhorn, Atri Rudra, Christopher Ré.</em> Arxiv 2019.</li>
</ol>

<p>        <a href="https://github.com/HazyResearch/butterfly"><img src="https://img.shields.io/github/stars/HazyResearch/butterfly" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2006.03555"><strong>Masked language modeling for proteins via linearly scalable long-context transformers.</strong></a> <em>Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, David Belanger, Lucy Colwell, Adrian Weller.</em> Arxiv 2020.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2009.14794"><strong>Rethinking attention with performers.</strong></a> <em>Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, Adrian Weller.</em> Arxiv 2020.</p>
  </li>
</ol>

<p>        <a href="https://github.com/lucidrains/performer-pytorch"><img src="https://img.shields.io/github/stars/lucidrains/performer-pytorch" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2006.04768"><strong>Linformer: Self-attention with linear complexity.</strong></a> <em>Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma.</em> Arxiv 2020.</li>
</ol>

<p>        <a href="https://github.com/lucidrains/linear-attention-transformer"><img src="https://img.shields.io/github/stars/lucidrains/linear-attention-transformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2103.02143"><strong>Random Feature Attention.</strong></a> <em>Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah A. Smith, Lingpeng Kong.</em> Arxiv 2021.</li>
</ol>

<p>        <a href="https://github.com/Noahs-ARK/RFA"><img src="https://img.shields.io/github/stars/Noahs-ARK/RFA" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2106.01540"><strong>Luna: Linear unified nested attention.</strong></a> <em>Xuezhe Ma, Xiang Kong, Sinong Wang, Chunting Zhou, Jonathan May, Hao Ma, Luke Zettlemoyer.</em> Arxiv 2021.</li>
</ol>

<p>        <a href="https://github.com/sooftware/luna-transformer"><img src="https://img.shields.io/github/stars/sooftware/luna-transformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2105.03824"><strong>Fnet: Mixing tokens with fourier transforms.</strong></a> <em>James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon.</em> Arxiv 2021.</li>
</ol>

<p>        <a href="https://github.com/jaketae/fnet"><img src="https://img.shields.io/github/stars/jaketae/fnet" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2312.06635v2"><strong>Gated Linear Attention Transformers with Hardware-Efficient Training.</strong></a> <em>Songlin Yang, Bailin Wang, Yikang Shen, Rameswar Panda, Yoon Kim.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/berlino/gated_linear_attention"><img src="https://img.shields.io/github/stars/berlino/gated_linear_attention" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.17512"><strong>Latent Attention for Linear Time Transformers.</strong></a> <em>Rares Dolga, Marius Cobzarenco, David Barber.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.18668"><strong>Simple linear attention language models balance the recall-throughput tradeoff.</strong></a> <em>Simran Arora, Sabri Eyuboglu, Michael Zhang, Aman Timalsina, Silas Alberti, Dylan Zinsley, James Zou, Atri Rudra, Christopher Ré.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/HazyResearch/based"><img src="https://img.shields.io/github/stars/HazyResearch/based" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.02882"><strong>Linear Attention Sequence Parallelism.</strong></a> <em>Weigao Sun, Zhen Qin, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/OpenNLPLab/LASP"><img src="https://img.shields.io/github/stars/OpenNLPLab/LASP" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.05843"><strong>Softmax Attention with Constant Cost per Token.</strong></a> <em>Franz A. Heinsen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/glassroom/heinsen_attention">![GitHub Repo stars](https://img.shields.io/github/stars/glassroom/heinsen_attention</a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.08801"><strong>Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length.</strong></a> <em>Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/XuezheMax/megalodon">![GitHub Repo stars](https://img.shields.io/github/stars/XuezheMax/megalodon</a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.17381"><strong>Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention.</strong></a> <em>Zhen Qin, Weigao Sun, Dong Li, Xuyang Shen, Weixuan Sun, Yiran Zhong.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.17383"><strong>Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective.</strong></a> <em>Zhen Qin, Xuyang Shen, Weigao Sun, Dong Li, Stan Birchfield, Richard Hartley, Yiran Zhong.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.13956"><strong>Attention as an RNN.</strong></a> <em>Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Mohamed Osama Ahmed, Yoshua Bengio, Greg Mori.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.21022"><strong>You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet.</strong></a> <em>Zhen Qin, Yuxin Mao, Xuyang Shen, Dong Li, Jing Zhang, Yuchao Dai, Yiran Zhong.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/OpenNLPLab/LightNet">![GitHub Repo stars](https://img.shields.io/github/stars/OpenNLPLab/LightNet</a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.07368"><strong>When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models.</strong></a> <em>Haoran You, Yichao Fu, Zheng Wang, Amir Yazdanbakhsh, Yingyan (Celine)Lin.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/GATECH-EIC/Linearized-LLM">![GitHub Repo stars](https://img.shields.io/github/stars/GATECH-EIC/Linearized-LLM</a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.04620"><strong>Learning to (Learn at Test Time): RNNs with Expressive Hidden States.</strong></a> <em>Yu Sun, Xinhao Li, Karan Dalal, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen, Xiaolong Wang, Sanmi Koyejo, Tatsunori Hashimoto, Carlos Guestrin.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/test-time-training/ttt-lm-pytorch">![GitHub Repo stars](https://img.shields.io/github/stars/test-time-training/ttt-lm-pytorch</a>
        <a href="https://github.com/test-time-training/ttt-lm-jax">![GitHub Repo stars](https://img.shields.io/github/stars/test-time-training/ttt-lm-jax</a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.07146"><strong>Gated Slot Attention for Efficient Linear-Time Sequence Modeling.</strong></a> <em>Yu Zhang, Songlin Yang, Ruijie Zhu, Yue Zhang, Leyang Cui, Yiqiao Wang, Bolun Wang, Freda Shi, Bailin Wang, Wei Bi, Peng Zhou, Guohong Fu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/sustcsonglin/flash-linear-attention">![GitHub Repo stars](https://img.shields.io/github/stars/sustcsonglin/flash-linear-attention</a></p>

<h3 id="23-hierarchical-attention">2.3 Hierarchical Attention</h3>

<ol>
  <li><a href="https://aclanthology.org/P19-1424.pdf"><strong>Neural Legal Judgment Prediction in English.</strong></a> <em>Ilias Chalkidis, Ion Androutsopoulos, Nikolaos Aletras.</em> ACL 2019.</li>
</ol>

<p>        <a href="https://github.com/PolarisRisingWar/pytorch_ljp"><img src="https://img.shields.io/github/stars/PolarisRisingWar/pytorch_ljp" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2201.06774"><strong>Hierarchical Neural Network Approaches for Long Document Classification.</strong></a> <em>Snehal Khandve, Vedangi Wagh, Apurva Wani, Isha Joshi, Raviraj Joshi.</em> ICML 2022.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2106.01040"><strong>Hi-transformer: Hierarchical interactive transformer for efficient and effective long document modeling.</strong></a> <em>Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang.</em> ACL-IJCNLP 2021</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2203.12276"><strong>Erniesparse: Learning hierarchical efficient transformer through regularized self-attention.</strong></a> <em>Yang Liu, Jiaxiang Liu, Li Chen, Yuxiang Lu, Shikun Feng, Zhida Feng, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang.</em> Arxiv 2022.</p>
  </li>
</ol>

<h3 id="24-io-aware-attention">2.4 IO-Aware Attention</h3>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2112.05682"><strong>Self-attention Does Not Need O(n^2) Memory.</strong></a> <em>Markus N. Rabe, Charles Staats.</em> Arxiv 2021.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2306.01160"><strong>Faster Causal Attention Over Large Sequences Through Sparse Flash Attention.</strong></a> <em>Matteo Pagliardini, Daniele Paliotta, Martin Jaggi, François Fleuret.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2205.14135"><strong>FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness.</strong></a> <em>Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré.</em> Arxiv 2022.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Dao-AILab/flash-attention"><img src="https://img.shields.io/github/stars/Dao-AILab/flash-attention" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2307.08691"><strong>FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning.</strong></a> <em>Tri Dao.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/Dao-AILab/flash-attention"><img src="https://img.shields.io/github/stars/Dao-AILab/flash-attention" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2309.06180"><strong>Efficient Memory Management for Large Language Model Serving with PagedAttention.</strong></a> <em>Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, Ion Stoica.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/vllm-project/vllm"><img src="https://img.shields.io/github/stars/vllm-project/vllm" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2307.14995"><strong>TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer.</strong></a> <em>Zhen Qin, Dong Li, Weigao Sun, Weixuan Sun, Xuyang Shen, Xiaodong Han, Yunshen Wei, Baohong Lv, Xiao Luo, Yu Qiao, Yiran Zhong.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/OpenNLPLab/TransnormerLLM"><img src="https://img.shields.io/github/stars/OpenNLPLab/TransnormerLLM" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2401.04695"><strong>Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models.</strong></a> <em>Zhen Qin, Weigao Sun, Dong Li, Xuyang Shen, Weixuan Sun, Yiran Zhong.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/OpenNLPLab/lightning-attention"><img src="https://img.shields.io/github/stars/OpenNLPLab/lightning-attention" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.15220"><strong>ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition.</strong></a> <em>Lu Ye, Ze Tao, Yong Huang, Yang Li.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.14469"><strong>SnapKV: LLM Knows What You are Looking for Before Generation.</strong></a> <em>Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/FasterDecoding/SnapKV"><img src="https://img.shields.io/github/stars/FasterDecoding/SnapKV" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://openreview.net/forum?id=uNrFpDPMyo"><strong>Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs.</strong></a> <em>Suyu Ge, Yunan Zhang, Liyuan Liu, Minjia Zhang, Jiawei Han, Jianfeng Gao.</em> ICLR 2024 Oral.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2403.09054"><strong>Keyformer: KV Cache Reduction through Key Tokens Selection for Efficient Generative Inference.</strong></a> <em>Muhammad Adnan, Akhil Arunkumar, Gaurav Jain, Prashant J. Nair, Ilya Soloveychik, Purushotham Kamath.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.18057"><strong>Efficient LLM Inference with Kcache.</strong></a> <em>Qiaozhi He, Zhihua Wu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.05254"><strong>You Only Cache Once: Decoder-Decoder Architectures for Language Models.</strong></a> <em>Yutao Sun, Li Dong, Yi Zhu, Shaohan Huang, Wenhui Wang, Shuming Ma, Quanlu Zhang, Jianyong Wang, Furu Wei.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/microsoft/unilm/tree/master/YOCO"><img src="https://img.shields.io/github/stars/microsoft/unilm" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/1911.02150"><strong>Fast Transformer Decoding: One Write-Head is All You Need.</strong></a> <em>Noam Shazeer.</em> Arxiv 2019.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2305.13245"><strong>GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints.</strong></a> <em>Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, Sumit Sanghai.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.04434"><strong>DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model.</strong></a> <em>DeepSeek-AI.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/deepseek-ai/DeepSeek-V2"><img src="https://img.shields.io/github/stars/deepseek-ai/DeepSeek-V2" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.10637"><strong>Layer-Condensed KV Cache for Efficient Inference of Large Language Models.</strong></a> <em>Haoyi Wu, Kewei Tu.</em> ACL 2024.</li>
</ol>

<p>        <a href="https://github.com/whyNLP/LCKV"><img src="https://img.shields.io/github/stars/whyNLP/LCKV" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.12981"><strong>Reducing Transformer Key-Value Cache Size with Cross-Layer Attention.</strong></a> <em>William Brandon, Mayank Mishra, Aniruddha Nrusimha, Rameswar Panda, Jonathan Ragan Kelly.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.12532"><strong>PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference.</strong></a> <em>William Brandon, Mayank Mishra, Aniruddha Nrusimha, Rameswar Panda, Jonathan Ragan Kelly.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/mutonix/pyramidinfer"><img src="https://img.shields.io/github/stars/mutonix/pyramidinfer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.12591"><strong>Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression.</strong></a> <em>Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Yipeng Ma, Tao Wang, Ji-Rong Wen.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.14366"><strong>MiniCache: KV Cache Compression in Depth Dimension for Large Language Models.</strong></a> <em>Akide Liu, Jing Liu, Zizheng Pan, Yefei He, Gholamreza Haffari, Bohan Zhuang.</em> NeurIPS 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.02069"><strong>PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling.</strong></a> <em>Zefan Cai., Yichi Zhang, Bofei Gao, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, Baobao Chang, Junjie Hu, Wen Xiao.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.07056"><strong>Effectively Compress KV Heads for LLM.</strong></a> <em>Hao Yu, Zelan Yang, Shen Li, Yong Li, Jianxin Wu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.11430"><strong>A Simple and Effective L2 Norm-Based Strategy for KV Cache Compression.</strong></a> <em>Alessio Devoto, Yu Zhao, Simone Scardapane, Pasquale Minervini.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.10774"><strong>Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference.</strong></a> <em>Jiaming Tang, Yilong Zhao, Kan Zhu, Guangxuan Xiao, Baris Kasikci, Song Han.</em> ICML 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/mit-han-lab/Quest"><img src="https://img.shields.io/github/stars/mit-han-lab/Quest" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.12335"><strong>Attention Score is not All You Need for Token Importance Indicator in KV Cache Reduction: Value Also Matters.</strong></a> <em>Zhiyu Guo, Hidetaka Kamigaito, Taro Watanabe.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.12018"><strong>CItruS: Chunked Instruction-aware State Eviction for Long Sequence Modeling.</strong></a> <em>Yu Bai, Xiyuan Zou, Heyan Huang, Sanxing Chen, Marc-Antoine Rondeau, Yang Gao, Jackie Chi Kit Cheung.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.13035"><strong>D2O: Dynamic Discriminative Operations for Efficient Generative Inference of Large Language Models.</strong></a> <em>Zhongwei Wan, Xinjian Wu, Yu Zhang, Yi Xin, Chaofan Tao, Zhihong Zhu, Xin Wang, Siqi Luo, Jing Xiong, Mi Zhang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.14909"><strong>MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression.</strong></a> <em>Weiyao Luo, Suncong Zheng, Heming Xia, Weikang Wang, Yan Lei, Tianyu Liu, Shuang Chen, Zhifang Sui.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.18139"><strong>LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference.</strong></a> <em>Zhongwei Wan, Ziang Wu, Che Liu, Jinfa Huang, Zhihong Zhu, Peng Jin, Longyue Wang, Li Yuan.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/SUSTechBruce/LOOK-M"><img src="https://img.shields.io/github/stars/SUSTechBruce/LOOK-M" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.17808"><strong>Training-Free Exponential Extension of Sliding Window Context with Cascading KV Cache.</strong></a> <em>Jeffrey Willette, Heejun Lee, Youngwan Lee, Myeongjae Jeon, Sung Ju Hwang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.07528"><strong>QuickLLaMA: Query-aware Inference Acceleration for Large Language Models.</strong></a> <em>Jingyao Li, Han Shi, Xin Jiang, Zhenguo Li, Hong Xu, Jiaya Jia.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/dvlab-research/Q-LLM"><img src="https://img.shields.io/github/stars/dvlab-research/Q-LLM" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.02490"><strong>MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention.</strong></a> <em>Huiqiang Jiang, Yucheng Li, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H. Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/microsoft/MInference"><img src="https://img.shields.io/github/stars/microsoft/MInference" alt="GitHub Repo stars" /></a>
        <a href="https://hqjiang.com/minference.html"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.08454"><strong>Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks.</strong></a> <em>Zheng Wang, Boxiao Jin, Zhongzhi Yu, Minjia Zhang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.11550"><strong>Optimizing KV Cache Eviction in LLMs: Adaptive Allocation for Enhanced Budget Utilization.</strong></a> <em>Yuan Feng, Junlin Lv, Yukun Cao, Xike Xie, S. Kevin Zhou.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.12866"><strong>Beyond KV Caching: Shared Attention for Efficient LLMs.</strong></a> <em>Bingli Liao, Danilo Vasconcellos Vargas.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/metacarbon/shareAtt"><img src="https://img.shields.io/github/stars/metacarbon/shareAtt" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.12820"><strong>PQCache: Product Quantization-based KVCache for Long Context LLM Inference.</strong></a> <em>Hailin Zhang, Xiaodong Ji, Yilin Chen, Fangcheng Fu, Xupeng Miao, Xiaonan Nie, Weipeng Chen, Bin Cui.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.14057"><strong>LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference.</strong></a> <em>Qichen Fu, Minsik Cho, Thomas Merth, Sachin Mehta, Mohammad Rastegari, Mahyar Najibi.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.15176"><strong>Farewell to Length Extrapolation, a Training-Free Infinite Context with Finite Attention Scope.</strong></a> <em>Xiaoran Liu, Qipeng Guo, Yuerong Song, Zhigeng Liu, Kai Lv, Hang Yan, Linlin Li, Qun Liu, Xipeng Qiu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.15891"><strong>RazorAttention: Efficient KV Cache Compression Through Retrieval Heads.</strong></a> <em>Hanlin Tang, Yang Lin, Jing Lin, Qingsen Han, Shikuan Hong, Yiwu Yao, Gongyi Wang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.08608"><strong>FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision.</strong></a> <em>Jay Shah, Ganesh Bikshandi, Ying Zhang, Vijay Thakkar, Pradeep Ramani, Tri Dao.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.21018"><strong>ThinK: Thinner Key Cache by Query-Driven Pruning.</strong></a> <em>Yuhui Xu, Zhanming Jie, Hanze Dong, Lei Wang, Xudong Lu, Aojun Zhou, Amrita Saha, Caiming Xiong, Doyen Sahoo.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.20485"><strong>A2SF: Accumulative Attention Scoring with Forgetting Factor for Token Pruning in Transformer Decoder.</strong></a> <em>Hyun-rae Jo, Dongkun Shin.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Dirac-Notation/A2SF"><img src="https://img.shields.io/github/stars/Dirac-Notation/A2SF" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2408.01890"><strong>Cross-layer Attention Sharing for Large Language Models.</strong></a> <em>Yongyu Mu, Yuzhang Wu, Yuchun Fan, Chenglong Wang, Hengyu Li, Qiaozhi He, Murun Yang, Tong Xiao, Jingbo Zhu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.03675"><strong>NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time.</strong></a> <em>Yilong Chen, Guoxia Wang, Junyuan Shang, Shiyao Cui, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun, Dianhai Yu, Hua Wu.</em> ACL 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/PaddlePaddle/Research/tree/master/NLP/ACL2024-NACL"><img src="https://img.shields.io/github/stars/PaddlePaddle/Research" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.04093"><strong>Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters.</strong></a> <em>Vasudev Shyam, Jonathan Pilault, Emily Shepperd, Quentin Anthony, Beren Millidge.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Zyphra/tree_attention"><img src="https://img.shields.io/github/stars/Zyphra/tree_attention" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.11049"><strong>MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding.</strong></a> <em>Jian Chen, Vashisth Tiwari, Ranajoy Sadhukhan, Zhuoming Chen, Jinyuan Shi, Ian En-Hsu Yen, Beidi Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Infini-AI-Lab/MagicDec/"><img src="https://img.shields.io/github/stars/Infini-AI-Lab/MagicDec" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.10593"><strong>CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios.</strong></a> <em>Luning Wang, Shiyao Li, Xuefei Ning, Zhihang Yuan, Shengen Yan, Guohao Dai, Yu Wang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/wln20/cskv/"><img src="https://img.shields.io/github/stars/wln20/cskv" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.10516"><strong>RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval.</strong></a> <em>Di Liu, Meng Chen, Baotong Lu, Huiqiang Jiang, Zhenhua Han, Qianxi Zhang, Qi Chen, Chengruidong Zhang, Bailu Ding, Kai Zhang, Chen Chen, Fan Yang, Yuqing Yang, Lili Qiu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.04992"><strong>InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference.</strong></a> <em>Xiurui Pan, Endian Li, Qiao Li, Shengwen Liang, Yizhou Shan, Ke Zhou, Yingwei Luo, Xiaolin Wang, Jie Zhang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.12490"><strong>CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs.</strong></a> <em>Junlin Lv, Yuan Feng, Xike Xie, Xin Jia, Qirong Peng, Guiming Xie.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.17422"><strong>Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction.</strong></a> <em>Zhenmei Shi, Yifei Ming, Xuan-Phi Nguyen, Yingyu Liang, Shafiq Joty.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/SalesforceAIResearch/GemFilter"><img src="https://img.shields.io/github/stars/SalesforceAIResearch/GemFilter" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.15012"><strong>Inference-Friendly Models With MixAttention.</strong></a> <em>Shashank Rajput, Ying Sheng, Sean Owen, Vitaliy Chiley.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.00161"><strong>KV-Compress: Paged KV-Cache Compression with Variable Compression Rates per Attention Head.</strong></a> <em>Isaac Rehg.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/IsaacRe/vllm-kvcompress"><img src="https://img.shields.io/github/stars/IsaacRe/vllm-kvcompress" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.01805"><strong>Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads.</strong></a> <em>Yuxiang Huang, Binhang Yuan, Xu Han, Chaojun Xiao, Zhiyuan Liu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/huangyuxiang03/Locret"><img src="https://img.shields.io/github/stars/huangyuxiang03/Locret" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.01518"><strong>InfiniPot: Infinite Context Processing on Memory-Constrained LLMs.</strong></a> <em>Minsoo Kim, Kyuhong Shim, Jungwook Choi, Simyung Chang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.03090"><strong>UNComp: Uncertainty-Aware Long-Context Compressor for Efficient Large Language Model Inference.</strong></a> <em>Jing Xiong, Jianghan Shen, Fanghua Ye, Chaofan Tao, Zhongwei Wan, Jianqiao Lu, Xun Wu, Chuanyang Zheng, Zhijiang Guo, Lingpeng Kong, Ngai Wong.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.03111"><strong>LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy.</strong></a> <em>Rongzhi Zhang, Kuang Wang, Liyuan Liu, Shuohang Wang, Hao Cheng, Chao Zhang, Yelong Shen.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.10819"><strong>DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads.</strong></a> <em>Guangxuan Xiao, Jiaming Tang, Jingwei Zuo, Junxian Guo, Shang Yang, Haotian Tang, Yao Fu, Song Han.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/mit-han-lab/duo-attention"><img src="https://img.shields.io/github/stars/mit-han-lab/duo-attention" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.12876"><strong>In-context KV-Cache Eviction for LLMs via Attention-Gate.</strong></a> <em>Zihao Zeng, Bokai Lin, Tianqi Hou, Hao Zhang, Zhijie Deng.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.13846"><strong>SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.</strong></a> <em>Xuan Zhang, Cunxiao Du, Chao Du, Tianyu Pang, Wei Gao, Min Lin.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/sail-sg/SimLayerKV"><img src="https://img.shields.io/github/stars/sail-sg/SimLayerKV" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.14442"><strong>A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference.</strong></a> <em>You Wu, Haoyi Wu, Kewei Tu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/whyNLP/LCKV"><img src="https://img.shields.io/github/stars/whyNLP/LCKV" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.18517"><strong>KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing.</strong></a> <em>Yifei Yang, Zouying Cao, Qiguang Chen, Libo Qin, Dongjie Yang, Hai Zhao, Zhi Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/yangyifei729/KVSharer"><img src="https://img.shields.io/github/stars/yangyifei729/KVSharer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.15252"><strong>Lossless KV Cache Compression to 2%.</strong></a> <em>Zhen Yang, J.N.Han, Kan Wu, Ruobing Xie, An Wang, Xingwu Sun, Zhanhui Kang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.14731"><strong>MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection.</strong></a> <em>Bokai Lin, Zihao Zeng, Zipeng Xiao, Siqi Kou, Tianqi Hou, Xiaofeng Gao, Hao Zhang, Zhijie Deng.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.15332"><strong>EPIC: Efficient Position-Independent Context Caching for Serving Large Language Models.</strong></a> <em>Junhao Hu, Wenrui Huang, Haoyi Wang, Weidong Wang, Tiancheng Hu, Qin Zhang, Hao Feng, Xusheng Chen, Yizhou Shan, Tao Xie.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.16179"><strong>MagicPIG: LSH Sampling for Efficient LLM Generation.</strong></a> <em>Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye, Yang Zhou, Jianyu Zhang, Niklas Nolte, Yuandong Tian, Matthijs Douze, Leon Bottou, Zhihao Jia, Beidi Chen.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Infini-AI-Lab/MagicPIG"><img src="https://img.shields.io/github/stars/Infini-AI-Lab/MagicPIG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.19258"><strong>Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning.</strong></a> <em>Yu Fu, Zefan Cai, Abedelkadir Asi, Wayne Xiong, Yue Dong, Wen Xiao.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.20926"><strong>Long Sequence Modeling with Attention Tensorization: From Sequence to Tensor Learning.</strong></a> <em>Aosong Feng, Rex Ying, Leandros Tassiulas.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.21465"><strong>ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference.</strong></a> <em>Hanshi Sun, Li-Wen Chang, Wenlei Bao, Size Zheng, Ningxin Zheng, Xin Liu, Harry Dong, Yuejie Chi, Beidi Chen.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/bytedance/ShadowKV"><img src="https://img.shields.io/github/stars/bytedance/ShadowKV" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.23079"><strong>BUZZ: Beehive-structured Sparse KV Cache with Segmented Heavy Hitters for Efficient LLM Inference.</strong></a> <em>Junqi Zhao, Zhijin Fang, Shu Li, Shaohui Yang, Shichao He.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/JunqiZhao888/buzz-llm"><img src="https://img.shields.io/github/stars/JunqiZhao888/buzz-llm" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.23317"><strong>VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration.</strong></a> <em>Dezhan Tu, Danylo Vashchilenko, Yuzhe Lu, Panpan Xu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2411.02886"><strong>TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection.</strong></a> <em>Wei Wu, Zhuoshi Pan, Chao Wang, Liyi Chen, Yunchu Bai, Kun Fu, Zheng Wang, Hui Xiong.</em> Arxiv 2024.</p>
  </li>
</ol>
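
<p>A recurring primitive behind many of the eviction- and selection-style entries above: score each cached token by its accumulated attention mass, then keep only attention sinks, a recent window, and the highest-scoring “heavy hitters” in between. The sketch below is a minimal NumPy rendering of that shared idea, not any single paper’s method; the names and the scoring rule are our simplification.</p>

<pre><code class="language-python">import numpy as np

def evict_kv(keys, values, attn_history, keep=64, n_sink=4, n_recent=16):
    """Keep sink tokens, a recent window, and top-scoring middle tokens."""
    T = keys.shape[0]
    if keep >= T:
        return keys, values
    scores = attn_history.sum(axis=0)            # accumulated attention per cached token
    middle = np.arange(n_sink, T - n_recent)     # only middle tokens may be evicted
    n_keep_mid = keep - n_sink - n_recent
    top = middle[np.argsort(scores[middle])[-n_keep_mid:]]
    idx = np.sort(np.concatenate([np.arange(n_sink), top, np.arange(T - n_recent, T)]))
    return keys[idx], values[idx]

rng = np.random.default_rng(0)
T, d = 256, 8
keys, values = rng.normal(size=(T, d)), rng.normal(size=(T, d))
attn_history = rng.random((32, T))               # 32 recent queries over T cached tokens
k2, v2 = evict_kv(keys, values, attn_history)
print(k2.shape)                                  # (64, 8)
</code></pre>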

<h2 id="3-recurrent-transformers">3. Recurrent Transformers</h2>

<ol>
  <li><a href="https://arxiv.org/abs/1901.02860"><strong>Transformer-XL: Attentive language models beyond a fixed-length context.</strong></a> <em>Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.</em> ACL 2019.</li>
</ol>

<p>        <a href="https://github.com/kimiyoung/transformer-xl"><img src="https://img.shields.io/github/stars/kimiyoung/transformer-xl" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/1911.05507"><strong>Compressive Transformers for Long-Range Sequence Modelling.</strong></a> <em>Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap.</em> Arxiv 2019.</li>
</ol>

<p>        <a href="https://github.com/lucidrains/compressive-transformer-pytorch"><img src="https://img.shields.io/github/stars/lucidrains/compressive-transformer-pytorch" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2010.06891"><strong>Memformer: The memory-augmented transformer.</strong></a> <em>Qingyang Wu, Zhenzhong Lan, Kun Qian, Jing Gu, Alborz Geramifard, Zhou Yu.</em> Arxiv 2020.</li>
</ol>

<p>        <a href="https://github.com/lucidrains/memformer"><img src="https://img.shields.io/github/stars/lucidrains/memformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://aclanthology.org/2021.acl-long.227/"><strong>ERNIE-Doc: A Retrospective Long-Document Modeling Transformer.</strong></a> <em>SiYu Ding, Junyuan Shang, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang.</em> ACL-IJCNLP 2021.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2203.08913"><strong>Memorizing Transformers.</strong></a> <em>Yuhuai Wu, Markus N. Rabe, DeLesley Hutchins, Christian Szegedy.</em> Arxiv 2022.</p>
  </li>
</ol>

<p>        <a href="https://github.com/lucidrains/memorizing-transformers-pytorch"><img src="https://img.shields.io/github/stars/lucidrains/memorizing-transformers-pytorch" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://aclanthology.org/2023.findings-acl.188/"><strong>Recurrent Attention Networks for Long-text Modeling.</strong></a> <em>Xianming Li, Zongxi Li, Xiaotian Luo, Haoran Xie, Xing Lee, Yingbin Zhao, Fu Lee Wang, Qing Li.</em> ACL 2023.</li>
</ol>

<p>        <a href="https://github.com/4ai/ran"><img src="https://img.shields.io/github/stars/4ai/ran" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2305.13048"><strong>RWKV: Reinventing RNNs for the Transformer Era.</strong></a> <em>Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Przemyslaw Kazienko, Jan Kocon, Jiaming Kong, Bartlomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Xiangru Tang, Bolun Wang, Johan S. Wind, Stansilaw Wozniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/BlinkDL/RWKV-LM"><img src="https://img.shields.io/github/stars/BlinkDL/RWKV-LM" alt="GitHub Repo stars" /></a>
        <a href="https://github.com/BlinkDL/ChatRWKV"><img src="https://img.shields.io/github/stars/BlinkDL/ChatRWKV" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2305.16340"><strong>Segmented Recurrent Transformer: An Efficient Sequence-to-Sequence Model.</strong></a> <em>Yinghan Long, Sayeed Shafayet Chowdhury, Kaushik Roy.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2304.11062"><strong>Scaling Transformer to 1M tokens and beyond with RMT.</strong></a> <em>Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2203.07852"><strong>Block-Recurrent Transformers.</strong></a> <em>DeLesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur.</em> Arxiv 2023.</p>
  </li>
</ol>

<p>        <a href="https://github.com/lucidrains/block-recurrent-transformer-pytorch"><img src="https://img.shields.io/github/stars/lucidrains/block-recurrent-transformer-pytorch" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2310.15494"><strong>TRAMS: Training-free Memory Selection for Long-range Language Modeling.</strong></a> <em>Haofei Yu, Cunxiang Wang, Yue Zhang, Wei Bi.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/lwaekfjlk/TRAMS"><img src="https://img.shields.io/github/stars/lwaekfjlk/TRAMS" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.19427"><strong>Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models.</strong></a> <em>Soham De, Samuel L. Smith, Anushan Fernando, Aleksandar Botev, George Cristian-Muraru, Albert Gu, Ruba Haroun, Leonard Berrada, Yutian Chen, Srivatsan Srinivasan, Guillaume Desjardins, Arnaud Doucet, David Budden, Yee Whye Teh, Razvan Pascanu, Nando De Freitas, Caglar Gulcehre.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.11577"><strong>Extensible Embedding: A Flexible Multipler For LLM’s Context Length.</strong></a> <em>Ninglu Shao, Shitao Xiao, Zheng Liu, Peitian Zhang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/FlagOpen/FlagEmbedding"><img src="https://img.shields.io/github/stars/FlagOpen/FlagEmbedding" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.05892"><strong>Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence.</strong></a> <em>Bo Peng, Daniel Goldstein, Quentin Anthony, Alon Albalak, Eric Alcaide, Stella Biderman, Eugene Cheah, Teddy Ferdinan, Haowen Hou, Przemysław Kazienko, Kranthi Kiran GV, Jan Kocoń, Bartłomiej Koptyra, Satyapriya Krishna, Ronald McClelland Jr., Niklas Muennighoff, Fares Obeid, Atsushi Saito, Guangyu Song, Haoqin Tu, Stanisław Woźniak, Ruichong Zhang, Bingchen Zhao, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/RWKV/RWKV-LM"><img src="https://img.shields.io/github/stars/RWKV/RWKV-LM" alt="GitHub Repo stars" /></a>
        <a href="https://github.com/RWKV/ChatRWKV"><img src="https://img.shields.io/github/stars/RWKV/ChatRWKV" alt="GitHub Repo stars" /></a>
        <a href="https://github.com/RWKV/RWKV-infctx-trainer"><img src="https://img.shields.io/github/stars/RWKV/RWKV-infctx-trainer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2404.07143"><strong>Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention.</strong></a> <em>Tsendsuren Munkhdalai, Manaal Faruqui, Siddharth Gopal.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.07839"><strong>RecurrentGemma: Moving Past Transformers for Efficient Open Language Models.</strong></a> <em>Aleksandar Botev, Soham De, Samuel L Smith, Anushan Fernando, George-Cristian Muraru, Ruba Haroun, Leonard Berrada, Razvan Pascanu, Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot, Johan Ferret, Sertan Girgin, Olivier Bachem, Alek Andreev, Kathleen Kenealy, Thomas Mesnard, Cassidy Hardin, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Armand Joulin, Noah Fiedel, Evan Senter, Yutian Chen, Srivatsan Srinivasan, Guillaume Desjardins, David Budden, Arnaud Doucet, Sharad Vikram, Adam Paszke, Trevor Gale, Sebastian Borgeaud, Charlie Chen, Andy Brock, Antonia Paterson, Jenny Brennan, Meg Risdal, Raj Gundluru, Nesh Devanathan, Paul Mooney, Nilay Chauhan, Phil Culliton, Luiz GUStavo Martins, Elisa Bandy, David Huntsperger, Glenn Cameron, Arthur Zucker, Tris Warkentin, Ludovic Peran, Minh Giang, Zoubin Ghahramani, Clément Farabet, Koray Kavukcuoglu, Demis Hassabis, Raia Hadsell, Yee Whye Teh, Nando de Frietas.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.06640"><strong>Linearizing Large Language Models.</strong></a> <em>Jean Mercat, Igor Vasiljevic, Sedrick Keh, Kushal Arora, Achal Dave, Adrien Gaidon, Thomas Kollar.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/TRI-ML/linear_open_lm"><img src="https://img.shields.io/github/stars/TRI-ML/linear_open_lm" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.13362"><strong>VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models.</strong></a> <em>Haowen Hou, Peigen Zeng, Fei Ma, Fei Richard Yu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/howard-hou/VisualRWKV"><img src="https://img.shields.io/github/stars/howard-hou/VisualRWKV" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.05483"><strong>Just read twice: closing the recall gap for recurrent language models.</strong></a> <em>Simran Arora, Aman Timalsina, Aaryan Singhal, Benjamin Spector, Sabri Eyuboglu, Xinyi Zhao, Ashish Rao, Atri Rudra, Christopher Ré.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/HazyResearch/prefix-linear-attention"><img src="https://img.shields.io/github/stars/HazyResearch/prefix-linear-attention" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.04841"><strong>Associative Recurrent Memory Transformer.</strong></a> <em>Ivan Rodkin, Yuri Kuratov, Aydar Bulatov, Mikhail Burtsev.</em> ICML 2024 Workshop.</li>
</ol>

<p>        <a href="https://github.com/RodkinIvan/associative-recurrent-memory-transformer"><img src="https://img.shields.io/github/stars/RodkinIvan/associative-recurrent-memory-transformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.12077"><strong>GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression.</strong></a> <em>Daniel Goldstein, Fares Obeid, Eric Alcaide, Guangyu Song, Eugene Cheah.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/recursal/GoldFinch-paper"><img src="https://img.shields.io/github/stars/recursal/GoldFinch-paper" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.03062"><strong>Analysis of Argument Structure Constructions in a Deep Recurrent Language Model.</strong></a> <em>Pegah Ramezani, Achim Schilling, Patrick Krauss.</em> Arxiv 2024.</li>
</ol>
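
<p>A pattern shared by Transformer-XL, RMT, Block-Recurrent Transformers, and most other entries in this section: process the input segment by segment and thread a small recurrent state (cached activations or memory tokens) between segments, so attention cost stays bounded per step. Below is a schematic NumPy sketch of that control flow; the tanh “layer” is a stand-in for a full transformer block, and all names are ours.</p>

<pre><code class="language-python">import numpy as np

def process_segment(segment, memory, W):
    """One toy 'transformer pass' over [memory; segment]; the updated memory
    is read back off the memory positions, as in memory-token designs like RMT."""
    x = np.concatenate([memory, segment], axis=0)
    h = np.tanh(x @ W)                            # stand-in for attention + MLP
    m = memory.shape[0]
    return h[m:], h[:m]                           # segment outputs, updated memory

def recurrent_forward(tokens, seg_len=128, n_mem=8, d=16, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=d ** -0.5, size=(d, d))
    memory = np.zeros((n_mem, d))
    outputs = []
    for s in range(0, len(tokens), seg_len):      # linear in length, constant memory
        out, memory = process_segment(tokens[s:s + seg_len], memory, W)
        outputs.append(out)
    return np.concatenate(outputs), memory

tokens = np.random.default_rng(1).normal(size=(1000, 16))
y, mem = recurrent_forward(tokens)
print(y.shape, mem.shape)                         # (1000, 16) (8, 16)
</code></pre>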

<h2 id="4-state-space-models">4. State Space Models</h2>

<ol>
  <li><a href="https://arxiv.org/abs/2312.00752"><strong>Mamba: Linear-Time Sequence Modeling with Selective State Spaces.</strong></a> <em>Albert Gu, Tri Dao.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/state-spaces/mamba"><img src="https://img.shields.io/github/stars/state-spaces/mamba" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2401.04081"><strong>MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts.</strong></a> <em>Maciej Pióro, Kamil Ciebiera, Krystian Król, Jan Ludziejewski, Sebastian Jaszczur.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.13660"><strong>MambaByte: Token-free Selective State Space Model.</strong></a> <em>Junxiong Wang, Tushaar Gangavarapu, Jing Nathan Yan, Alexander M Rush.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.17919"><strong>LOCOST: State-Space Models for Long Document Abstractive Summarization.</strong></a> <em>Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy F. Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, Patrick Gallinari.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2403.16899"><strong>State Space Models as Foundation Models: A Control Theoretic Overview.</strong></a> <em>Carmen Amo Alonso, Jerome Sieber, Melanie N. Zeilinger.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2403.19887"><strong>Jamba: A Hybrid Transformer-Mamba Language Model.</strong></a> <em>Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz, Omri Abend, Raz Alon, Tomer Asida, Amir Bergman, Roman Glozman, Michael Gokhman, Avashalom Manevich, Nir Ratner, Noam Rozen, Erez Shwartz, Mor Zusman, Yoav Shoham.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://openreview.net/forum?id=DjeQ39QoLQ"><strong>Robustifying State-space Models for Long Sequences via Approximate Diagonalization.</strong></a> <em>Annan Yu, Arnur Nigmetov, Dmitriy Morozov, Michael W. Mahoney, N. Benjamin Erichson.</em> ICLR 2024 Spotlight.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.16712"><strong>Zamba: A Compact 7B SSM Hybrid Model.</strong></a> <em>Paolo Glorioso, Quentin Anthony, Yury Tokpanov, James Whittington, Jonathan Pilault, Adam Ibrahim, Beren Millidge.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.21060"><strong>Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality.</strong></a> <em>Tri Dao, Albert Gu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/state-spaces/mamba"><img src="https://img.shields.io/github/stars/state-spaces/mamba" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.07522"><strong>Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling.</strong></a> <em>Liliang Ren, Yang Liu, Yadong Lu, Yelong Shen, Chen Liang, Weizhu Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/microsoft/Samba"><img src="https://img.shields.io/github/stars/microsoft/Samba" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.07887"><strong>An Empirical Study of Mamba-based Language Models.</strong></a> <em>Roger Waleffe, Wonmin Byeon, Duncan Riach, Brandon Norick, Vijay Korthikanti, Tri Dao, Albert Gu, Ali Hatamizadeh, Sudhakar Singh, Deepak Narayanan, Garvit Kulshreshtha, Vartika Singh, Jared Casper, Jan Kautz, Mohammad Shoeybi, Bryan Catanzaro.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/NVIDIA/Megatron-LM/tree/ssm/examples/mamba"><img src="https://img.shields.io/github/stars/NVIDIA/Megatron-LM" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.06324"><strong>B’MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory.</strong></a> <em>Luca Zancato, Arjun Seshadri, Yonatan Dukler, Aditya Golatkar, Yantao Shen, Benjamin Bowman, Matthew Trager, Alessandro Achille, Stefano Soatto.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.10347"><strong>MambaForGCN: Enhancing Long-Range Dependency with State Space Model and Kolmogorov-Arnold Networks for Aspect-Based Sentiment Analysis.</strong></a> <em>Adamu Lawan, Juhua Pu, Haruna Yunusa, Aliyu Umar, Muhammad Lawan.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.10998"><strong>Discrete Diffusion Language Model for Long Text Summarization.</strong></a> <em>Do Huu Dat, Do Duc Anh, Anh Tuan Luu, Wray Buntine.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.19832"><strong>ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2.</strong></a> <em>Wenjun Huang, Jianguo Hu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/WenjunHuang94/ML-Mamba"><img src="https://img.shields.io/github/stars/WenjunHuang94/ML-Mamba" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2408.12570"><strong>Jamba-1.5: Hybrid Transformer-Mamba Models at Scale.</strong></a> <em>Jamba Team: Barak Lenz, Alan Arazi, Amir Bergman, Avshalom Manevich, Barak Peleg, Ben Aviram, Chen Almagor, Clara Fridman, Dan Padnos, Daniel Gissin, Daniel Jannai, Dor Muhlgay, Dor Zimberg, Edden M Gerber, Elad Dolev, Eran Krakovsky, Erez Safahi, Erez Schwartz, Gal Cohen, Gal Shachaf, Haim Rozenblum, Hofit Bata, Ido Blass, Inbal Magar, Itay Dalmedigos, Jhonathan Osin, Julie Fadlon, Maria Rozman, Matan Danos, Michael Gokhman, Mor Zusman, Naama Gidron, Nir Ratner, Noam Gat, Noam Rozen, Oded Fried, Ohad Leshno, Omer Antverg, Omri Abend, Opher Lieber, Or Dagan, Orit Cohavi, Raz Alon, Ro’i Belson, Roi Cohen, Rom Gilad, Roman Glozman, Shahar Lev, Shaked Meirom, Tal Delbari, Tal Ness, Tomer Asida, Tom Ben Gal, Tom Braude, Uriya Pumerantz, Yehoshua Cohen, Yonatan Belinkov, Yuval Globerson, Yuval Peleg Levy, Yoav Shoham.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.14909"><strong>SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models.</strong></a> <em>Shuaijie Shen, Chao Wang, Renzhuo Huang, Yan Zhong, Qinghai Guo, Zhichao Lu, Jianguo Zhang, Luziwei Leng.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.15496"><strong>ReMamba: Equip Mamba with Effective Long-Sequence Modeling.</strong></a> <em>Danlong Yuan, Jiahao Liu, Bei Li, Huishuai Zhang, Jingang Wang, Xunliang Cai, Dongyan Zhao.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.07145"><strong>Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling.</strong></a> <em>Yingfa Chen, Xinrong Zhang, Shengding Hu, Xu Han, Zhiyuan Liu, Maosong Sun.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/thunlp/stuffed-mamba"><img src="https://img.shields.io/github/stars/thunlp/stuffed-mamba" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.18572"><strong>Taipan: Efficient and Expressive State Space Language Models with Selective Attention.</strong></a> <em>Chien Van Nguyen, Huy Huu Nguyen, Thang M. Pham, Ruiyi Zhang, Hanieh Deilamsalehy, Puneet Mathur, Ryan A. Rossi, Trung Bui, Viet Dac Lai, Franck Dernoncourt, Thien Huu Nguyen.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.14725"><strong>Rethinking Token Reduction for State Space Models.</strong></a> <em>Zheng Zhan, Yushu Wu, Zhenglun Kong, Changdi Yang, Yifan Gong, Xuan Shen, Xue Lin, Pu Zhao, Yanzhi Wang.</em> EMNLP 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/wuyushuwys/ToR_SSM"><img src="https://img.shields.io/github/stars/wuyushuwys/ToR_SSM" alt="GitHub Repo stars" /></a></p>

<h2 id="5-length-extrapolation">5. Length Extrapolation</h2>

<ol>
  <li><a href="https://arxiv.org/abs/2104.09864"><strong>RoFormer: Enhanced Transformer with Rotary Position Embedding.</strong></a> <em>Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu.</em> Arxiv 2021.</li>
</ol>

<p>        <a href="https://github.com/ZhuiyiTechnology/roformer"><img src="https://img.shields.io/github/stars/ZhuiyiTechnology/roformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2108.12409"><strong>Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation.</strong></a> <em>Ofir Press, Noah A. Smith, Mike Lewis.</em> ICLR 2022.</li>
</ol>

<p>        <a href="https://github.com/ofirpress/attention_with_linear_biases"><img src="https://img.shields.io/github/stars/ofirpress/attention_with_linear_biases" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2205.09921"><strong>KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation.</strong></a> <em>Ta-Chung Chi, Ting-Han Fan, Peter J. Ramadge, Alexander I. Rudnicky.</em> Arxiv 2022.</p>
  </li>
  <li>
    <p><a href="https://aclanthology.org/2023.acl-long.756/"><strong>Dissecting Transformer Length Extrapolation via the Lens of Receptive Field Analysis.</strong></a> <em>Ta-Chung Chi, Ting-Han Fan, Alexander I. Rudnicky, Peter J. Ramadge.</em> ACL 2023.</p>
  </li>
  <li>
    <p><a href="https://aclanthology.org/2023.acl-long.816/"><strong>A Length-Extrapolatable Transformer.</strong></a> <em>Yutao Sun, Li Dong, Barun Patra, Shuming Ma, Shaohan Huang, Alon Benhaim, Vishrav Chaudhary, Xia Song, Furu Wei.</em> ACL 2023.</p>
  </li>
</ol>

<p>        <a href="https://github.com/sunyt32/torchscale"><img src="https://img.shields.io/github/stars/sunyt32/torchscale" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://aclanthology.org/2023.acl-short.161/"><strong>Randomized Positional Encodings Boost Length Generalization of Transformers.</strong></a> <em>Anian Ruoss, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Róbert Csordás, Mehdi Bennani, Shane Legg, Joel Veness.</em> ACL 2023.</li>
</ol>

<p>        <a href="https://github.com/google-deepmind/randomized_positional_encodings"><img src="https://img.shields.io/github/stars/google-deepmind/randomized_positional_encodings" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2305.19466"><strong>The Impact of Positional Encoding on Length Generalization in Transformers.</strong></a> <em>Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/McGill-NLP/length-generalization"><img src="https://img.shields.io/github/stars/McGill-NLP/length-generalization" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2307.03170"><strong>Focused Transformer: Contrastive Training for Context Scaling.</strong></a> <em>Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/CStanKonrad/long_llama"><img src="https://img.shields.io/github/stars/CStanKonrad/long_llama" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2306.15595"><strong>Extending Context Window of Large Language Models via Positional Interpolation.</strong></a> <em>Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2307.10156"><strong>Exploring Transformer Extrapolation.</strong></a> <em>Zhen Qin, Yiran Zhong, Hui Deng.</em> Arxiv 2023.</p>
  </li>
</ol>

<p>        <a href="https://github.com/OpenNLPLab/Rpe"><img src="https://img.shields.io/github/stars/OpenNLPLab/Rpe" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/pdf/2308.16137.pdf"><strong>LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models.</strong></a> <em>Chi Han, Qifan Wang, Wenhan Xiong, Yu Chen, Heng Ji, Sinong Wang.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/kyegomez/LM-Infinite"><img src="https://img.shields.io/github/stars/kyegomez/LM-Infinite" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2309.00071"><strong>YaRN: Efficient Context Window Extension of Large Language Models.</strong></a> <em>Bowen Peng, Jeffrey Quesnelle, Honglu Fan, Enrico Shippole.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/jquesnelle/yarn"><img src="https://img.shields.io/github/stars/jquesnelle/yarn" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2309.10400"><strong>PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training.</strong></a> <em>Dawei Zhu,Nan Yang,Liang Wang,Yifan Song,Wenhao Wu,Furu Wei,Sujian Li.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/dwzhu-pku/PoSE"><img src="https://img.shields.io/github/stars/dwzhu-pku/PoSE" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2309.12307"><strong>LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models.</strong></a> <em>Yukang Chen, Shengju Qian, Haotian Tang, Xin Lai, Zhijian Liu, Song Han, Jiaya Jia.</em> ICLR 2024 Oral.</li>
</ol>

<p>        <a href="https://github.com/dvlab-research/LongLoRA"><img src="https://img.shields.io/github/stars/dvlab-research/LongLoRA" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2310.05209"><strong>Scaling Laws of RoPE-based Extrapolation.</strong></a> <em>Xiaoran Liu, Hang Yan, Shuo Zhang, Chenxin An, Xipeng Qiu, Dahua Lin.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/pdf/2311.00684v1.pdf"><strong>Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation.</strong></a> <em>Ta-Chung Chi,Ting-Han Fan,Alexander I. Rudnicky.</em> Arxiv 2023.</p>
  </li>
</ol>

<p>        <a href="https://github.com/chijames/Attention-Alignment-Transformer-Length-Extrapolation"><img src="https://img.shields.io/github/stars/chijames/Attention-Alignment-Transformer-Length-Extrapolation" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2309.08646"><strong>CoCA: Fusing position embedding with Collinear Constrained Attention for fine-tuning free context window extending.</strong></a> <em>Shiyi Zhu, Jing Ye, Wei Jiang, Qi Zhang, Yifan Wu, Jianguo Li.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/codefuse-ai/Collinear-Constrained-Attention"><img src="https://img.shields.io/github/stars/codefuse-ai/Collinear-Constrained-Attention" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2312.17296"><strong>Structured Packing in LLM Training Improves Long Context Utilization.</strong></a> <em>Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur, Henryk Michalewski, Łukasz Kuciński, Piotr Miłoś.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.01325v1"><strong>LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning.</strong></a> <em>Hongye Jin, Xiaotian Han, Jingfeng Yang, Zhimeng Jiang, Zirui Liu, Chia-Yuan Chang, Huiyuan Chen, Xia Hu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.02669"><strong>Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache.</strong></a> <em>Bin Lin, Tao Peng, Chen Zhang, Minmin Sun, Lanbo Li, Hanyu Zhao, Wencong Xiao, Qi Xu, Xiafei Qiu, Shen Li, Zhigang Ji, Yong Li, Wei Lin.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.04695"><strong>Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models.</strong></a> <em>Zhen Qin, Weigao Sun, Dong Li, Xuyang Shen, Weixuan Sun, Yiran Zhong.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/OpenNLPLab/lightning-attention"><img src="https://img.shields.io/github/stars/OpenNLPLab/lightning-attention" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2401.07004"><strong>Extending LLMs’ Context Window with 100 Samples.</strong></a> <em>Yikai Zhang, Junlong Li, Pengfei Liu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/GAIR-NLP/Entropy-ABF"><img src="https://img.shields.io/github/stars/GAIR-NLP/Entropy-ABF" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2401.06951"><strong>E^2-LLM: Efficient and Extreme Length Extension of Large Language Models.</strong></a> <em>Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, Yu Zhang, Ge Zhang, Jiakai Wang, Haoran Que, Yukang Chen, Wenbo Su, Tiezheng Ge, Jie Fu, Wenhu Chen, Bo Zheng.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.11504"><strong>With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation.</strong></a> <em>Y. Wang, D. Ma, D. Cai.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/TemporaryLoRA/Temp-LoRA"><img src="https://img.shields.io/github/stars/TemporaryLoRA/Temp-LoRA" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2401.16421"><strong>Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation.</strong></a> <em>Zhenyu He, Guhao Feng, Shengjie Luo, Kai Yang, Di He, Jingjing Xu, Zhi Zhang, Hongxia Yang, Liwei Wang.</em> ICML 2024.</li>
</ol>

<p>        <a href="https://github.com/zhenyuhe00/BiPE"><img src="https://img.shields.io/github/stars/zhenyuhe00/BiPE" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2401.17377"><strong>Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens.</strong></a> <em>Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, Hannaneh Hajishirzi.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/liujch1998/infini-gram"><img src="https://img.shields.io/github/stars/liujch1998/infini-gram" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.13753"><strong>LongRoPE: Extending LLM ContextWindow Beyond 2 Million Tokens.</strong></a> <em>Yiran Ding, Li Lyna Zhang, Chengruidong Zhang, Yuanyuan Xu, Ning Shang, Jiahang Xu, Fan Yang, Mao Yang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.10171"><strong>Data Engineering for Scaling Language Models to 128K Context.</strong></a> <em>Yao Fu, Rameswar Panda, Xinyao Niu, Xiang Yue, Hannaneh Hajishirzi, Yoon Kim, Hao Peng.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/FranxYao/Long-Context-Data-Engineering"><img src="https://img.shields.io/github/stars/FranxYao/Long-Context-Data-Engineering" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.09371v1"><strong>Transformers Can Achieve Length Generalization But Not Robustly.</strong></a> <em>Yongchao Zhou, Uri Alon, Xinyun Chen, Xuezhi Wang, Rishabh Agarwal, Denny Zhou.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.16617"><strong>Long-Context Language Modeling with Parallel Context Encoding.</strong></a> <em>Howard Yen, Tianyu Gao, Danqi Chen.</em> ACL 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/princeton-nlp/CEPE"><img src="https://img.shields.io/github/stars/princeton-nlp/CEPE" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2310.16450"><strong>CLEX: Continuous Length Extrapolation for Large Language Models.</strong></a> <em>Guanzheng Chen, Xin Li, Zaiqiao Meng, Shangsong Liang, Lidong Bing.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/DAMO-NLP-SG/CLEX"><img src="https://img.shields.io/github/stars/DAMO-NLP-SG/CLEX" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2403.00071"><strong>Resonance RoPE: Improving Context Length Generalization of Large Language Models.</strong></a> <em>Suyuchen Wang, Ivan Kobyzev, Peng Lu, Mehdi Rezagholizadeh, Bang Liu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/sheryc/resonance_rope"><img src="https://img.shields.io/github/stars/sheryc/resonance_rope" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2403.05004"><strong>Can’t Remember Details in Long Documents? You Need Some R&amp;R.</strong></a> <em>Devanshu Agrawal, Shang Gao, Martin Gajek.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/casetext/r-and-r"><img src="https://img.shields.io/github/stars/casetext/r-and-r" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2403.04797"><strong>Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding.</strong></a> <em>Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/VITA-Group/Ms-PoE"><img src="https://img.shields.io/github/stars/VITA-Group/Ms-PoE" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.04617"><strong>InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory.</strong></a> <em>Chaojun Xiao, Pengle Zhang, Xu Han, Guangxuan Xiao, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Song Han, Maosong Sun.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2403.17552"><strong>Naive Bayes-based Context Extension for Large Language Models.</strong></a> <em>Jianlin Su, Murtadha Ahmed, Wenbo, Luo Ao, Mingren Zhu, Yunfeng Liu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/amurtadha/NBCE-master"><img src="https://img.shields.io/github/stars/amurtadha/NBCE-master" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2403.09054"><strong>Keyformer: KV Cache Reduction through Key Tokens Selection for Efficient Generative Inference.</strong></a> <em>Muhammad Adnan, Akhil Arunkumar, Gaurav Jain, Prashant J. Nair, Ilya Soloveychik, Purushotham Kamath.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://openreview.net/forum?id=LXVswInHOo"><strong>In-Context Pretraining: Language Modeling Beyond Document Boundaries.</strong></a> <em>Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Wen-tau Yih, Mike Lewis.</em> ICLR 2024 Spotlight.</p>
  </li>
</ol>

<p>        <a href="https://github.com/swj0419/in-context-pretraining"><img src="https://img.shields.io/github/stars/swj0419/in-context-pretraining" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2309.16039"><strong>Effective Long-Context Scaling of Foundation Models.</strong></a> <em>Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.10830"><strong>Fewer Truncations Improve Language Modeling.</strong></a> <em>Hantian Ding, Zijian Wang, Giovanni Paolini, Varun Kumar, Anoop Deoras, Dan Roth, Stefano Soatto.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.12224"><strong>Length Generalization of Causal Transformers without Position Encoding.</strong></a> <em>Jie Wang, Tao Ji, Yuanbin Wu, Hang Yan, Tao Gui, Qi Zhang, Xuanjing Huang, Xiaoling Wang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/AntNLP/nope_head_scale"><img src="https://img.shields.io/github/stars/AntNLP/nope_head_scale" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.19553"><strong>Extending Llama-3’s Context Ten-Fold Overnight.</strong></a> <em>Peitian Zhang, Ninglu Shao, Zheng Liu, Shitao Xiao, Hongjin Qian, Qiwei Ye, Zhicheng Dou.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/FlagOpen/FlagEmbedding"><img src="https://img.shields.io/github/stars/FlagOpen/FlagEmbedding" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.03939"><strong>Long Context Alignment with Short Instructions and Synthesized Positions.</strong></a> <em>Wenhao Wu, Yizhong Wang, Yao Fu, Xiang Yue, Dawei Zhu, Sujian Li.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/nightdessert/SkipAlign"><img src="https://img.shields.io/github/stars/nightdessert/SkipAlign" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.04517"><strong>xLSTM: Extended Long Short-Term Memory.</strong></a> <em>Maximilian Beck, Korbinian Pöppel, Markus Spanring, Andreas Auer, Oleksandra Prudnikova, Michael Kopp, Günter Klambauer, Johannes Brandstetter, Sepp Hochreiter.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.14722"><strong>DAPE: Data-Adaptive Positional Encoding for Length Extrapolation.</strong></a> <em>Chuanyang Zheng, Yihang Gao, Han Shi, Minbin Huang, Jingyao Li, Jing Xiong, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li.</em> NeurIPS 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/chuanyang-Zheng/DAPE"><img src="https://img.shields.io/github/stars/chuanyang-Zheng/DAPE" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.18719"><strong>Contextual Position Encoding: Learning to Count What’s Important.</strong></a> <em>Olga Golovneva, Tianlu Wang, Jason Weston, Sainbayar Sukhbaatar.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.19846"><strong>Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model.</strong></a> <em>Chaochen Gao, Xing Wu, Qi Fu, Songlin Hu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.20671"><strong>Position Coupling: Leveraging Task Structure for Improved Length Generalization of Transformers.</strong></a> <em>Hanseul Cho, Jaeyoung Cha, Pranjal Awasthi, Srinadh Bhojanapalli, Anupam Gupta, Chulhee Yun.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/HanseulJo/position-coupling"><img src="https://img.shields.io/github/stars/HanseulJo/position-coupling" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.00605"><strong>LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models.</strong></a> <em>Liang Zhao, Tianwen Wei, Liang Zeng, Cheng Cheng, Liu Yang, Peng Cheng, Lijie Wang, Chenxia Li, Xuejie Wu, Bo Zhu, Yimeng Gan, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.01895"><strong>Explicitly Encoding Structural Symmetry is Key to Length Generalization in Arithmetic Tasks.</strong></a> <em>Mahdi Sabbaghi, George Pappas, Hamed Hassani, Surbhi Goel.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.07138"><strong>Never Miss A Beat: An Efficient Recipe for Context Window Extension of Large Language Models with Consistent “Middle” Enhancement.</strong></a> <em>Tong Wu, Yanpeng Zhao, Zilong Zheng.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.09897"><strong>3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position Encoding.</strong></a> <em>Xindian Ma, Wenyuan Liu, Peng Zhang, Nan Xu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.19598"><strong>Mixture of In-Context Experts Enhance LLMs’ Long Context Awareness.</strong></a> <em>Hongzhan Lin, Ang Lv, Yuhan Chen, Chen Zhu, Yang Song, Hengshu Zhu, Rui Yan.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/p1nksnow/MoICE"><img src="https://img.shields.io/github/stars/p1nksnow/MoICE" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.09450"><strong>Human-like Episodic Memory for Infinite Context LLMs.</strong></a> <em>Zafeirios Fountas, Martin A Benfeghoul, Adnan Oomerjee, Fenia Christopoulou, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.13739"><strong>Scaling Granite Code Models to 128K Context.</strong></a> <em>Matt Stallone, Vaibhav Saxena, Leonid Karlinsky, Bridget McGinn, Tim Bula, Mayank Mishra, Adriana Meza Soria, Gaoyuan Zhang, Aditya Prasad, Yikang Shen, Saptha Surendran, Shanmukha Guttula, Hima Patel, Parameswaran Selvam, Xuan-Hong Dang, Yan Koyfman, Atin Sood, Rogerio Feris, Nirmit Desai, David D. Cox, Ruchir Puri, Rameswar Panda.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/ibm-granite/granite-code-models"><img src="https://img.shields.io/github/stars/ibm-granite/granite-code-models" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.14482"><strong>ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities.</strong></a> <em>Peng Xu, Wei Ping, Xianchao Wu, Zihan Liu, Mohammad Shoeybi, Bryan Catanzaro.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.01866"><strong>Efficient Solutions For An Intriguing Failure of LLMs: Long Context Window Does Not Mean LLMs Can Analyze Long Sequences Flawlessly.</strong></a> <em>Peyman Hosseini, Ignacio Castro, Iacopo Ghinassi, Matthew Purver.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.11745"><strong>FocusLLM: Scaling LLM’s Context by Parallel Decoding.</strong></a> <em>Zhenyu Li, Yike Zhang, Tengyu Pan, Yutao Sun, Zhichao Duan, Junjie Fang, Rong Han, Zixuan Wang, Jianyong Wang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/leezythu/FocusLLM"><img src="https://img.shields.io/github/stars/leezythu/FocusLLM" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.00509"><strong>LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models.</strong></a> <em>Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Yan Wang, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/zhiyuanhubj/LongRecipe"><img src="https://img.shields.io/github/stars/zhiyuanhubj/LongRecipe" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.06679"><strong>E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning.</strong></a> <em>Zihan Liao, Jun Wang, Hang Yu, Lingxiao Wei, Jianguo Li, Jun Wang, Wei Zhang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.04774"><strong>Untie the Knots: An Efficient Data Augmentation Strategy for Long-Context Pre-Training in Language Models.</strong></a> <em>Junfeng Tian, Da Zheng, Yang Cheng, Rui Wang, Colin Zhang, Debing Zhang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/rgtjf/Untie-the-Knots"><img src="https://img.shields.io/github/stars/rgtjf/Untie-the-Knots" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.19745"><strong>PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead.</strong></a> <em>Tao Tan, Yining Qian, Ang Lv, Hongzhan Lin, Songhao Wu, Yongbo Wang, Feng Wang, Jingtong Wu, Xin Lu, Rui Yan.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/TTArch/PEAR-RAG"><img src="https://img.shields.io/github/stars/TTArch/PEAR-RAG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.01651"><strong>Efficient Long-range Language Modeling with Self-supervised Causal Retrieval.</strong></a> <em>Xiang Hu, Zhihao Teng, Wei Wu, Kewei Tu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.01485"><strong>A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts.</strong></a> <em>Suyu Ge, Xihui Lin, Yunan Zhang, Jiawei Han, Hao Peng.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.01490"><strong>Extending Context Window of Large Language Models from a Distributional Perspective.</strong></a> <em>Yingsheng Wu, Yuxuan Gu, Xiaocheng Feng, Weihong Zhong, Dongliang Xu, Qing Yang, Hongtao Liu, Bing Qin.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/1180301012/DPRoPE"><img src="https://img.shields.io/github/stars/1180301012/DPRoPE" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.02660"><strong>How to Train Long-Context Language Models (Effectively).</strong></a> <em>Tianyu Gao, Alexander Wettig, Howard Yen, Danqi Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/princeton-nlp/ProLong"><img src="https://img.shields.io/github/stars/princeton-nlp/ProLong" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.05258"><strong>Differential Transformer.</strong></a> <em>Tianzhu Ye, Li Dong, Yuqing Xia, Yutao Sun, Yi Zhu, Gao Huang, Furu Wei.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.04798"><strong>DAPE V2: Process Attention Score as Feature Map for Length Extrapolation.</strong></a> <em>Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, Jingyao Li, Minbin Huang, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/chuanyang-Zheng/DAPE"><img src="https://img.shields.io/github/stars/chuanyang-Zheng/DAPE" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.18745"><strong>Why Does the Effective Context Length of LLMs Fall Short?.</strong></a> <em>Chenxin An, Jun Zhang, Ming Zhong, Lei Li, Shansan Gong, Yao Luo, Jingjing Xu, Lingpeng Kong.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.18533"><strong>LOGO – Long cOntext aliGnment via efficient preference Optimization.</strong></a> <em>Zecheng Tang, Zechen Sun, Juntao Li, Qiaoming Zhu, Min Zhang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/ZetangForward/LCM_Stack"><img src="https://img.shields.io/github/stars/ZetangForward/LCM_Stack" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.15633"><strong>Selecting Influential Samples for Long Context Alignment via Homologous Models’ Guidance and Contextual Awareness Measurement.</strong></a> <em>Shuzheng Si, Haozhe Zhao, Gang Chen, Yunshui Li, Kangyang Luo, Chuancheng Lv, Kaikai An, Fanchao Qi, Baobao Chang, Maosong Sun.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.19318"><strong>Two are better than one: Context window extension with multi-grained self-injection.</strong></a> <em>Wei Han, Pan Zhou, Soujanya Poria, Shuicheng Yan.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Clement25/SharedLLM"><img src="https://img.shields.io/github/stars/Clement25/SharedLLM" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.21252"><strong>LongReward: Improving Long-context Large Language Models with AI Feedback.</strong></a> <em>Jiajie Zhang, Zhongni Hou, Xin Lv, Shulin Cao, Zhenyu Hou, Yilin Niu, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/THUDM/LongReward"><img src="https://img.shields.io/github/stars/THUDM/LongReward" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.21216"><strong>HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation.</strong></a> <em>Yuhan Chen, Ang Lv, Jian Luan, Bin Wang, Wei Liu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.23771"><strong>What is Wrong with Perplexity for Long-context Language Modeling?.</strong></a> <em>Lizhe Fang, Yifei Wang, Zhaoyang Liu, Chenheng Zhang, Stefanie Jegelka, Jinyang Gao, Bolin Ding, Yisen Wang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/PKU-ML/LongPPL"><img src="https://img.shields.io/github/stars/PKU-ML/LongPPL" alt="GitHub Repo stars" /></a></p>

<h2 id="6-long-term-memory">6. Long Term Memory</h2>

<ol>
  <li><a href="https://arxiv.org/abs/2304.13343"><strong>Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System.</strong></a> <em>Xinnian Liang, Bing Wang, Hui Huang, Shuangzhi Wu, Peihao Wu, Lu Lu, Zejun Ma, Zhoujun Li.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/wbbeyourself/SCM4LLMs"><img src="https://img.shields.io/github/stars/wbbeyourself/SCM4LLMs" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2305.10250"><strong>MemoryBank: Enhancing Large Language Models with Long-Term Memory.</strong></a> <em>Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, Yanlin Wang.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/zhongwanjun/MemoryBank-SiliconFriend"><img src="https://img.shields.io/github/stars/zhongwanjun/MemoryBank-SiliconFriend" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2307.11462"><strong>Improve Long-term Memory Learning Through Rescaling the Error Temporally.</strong></a> <em>Shida Wang, Zhanglu Yan.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2308.15022"><strong>Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models.</strong></a> <em>Qingyue Wang, Liang Ding, Yanan Cao, Zhiliang Tian, Shi Wang, Dacheng Tao, Li Guo.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2312.17259"><strong>Empowering Working Memory for Large Language Model Agents.</strong></a> <em>Jing Guo, Nan Li, Jianchuan Qi, Hang Yang, Ruiqiao Li, Yuzhen Feng, Si Zhang, Ming Xu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2312.17257"><strong>Evolving Large Language Model Assistant with Long-Term Conditional Memory.</strong></a> <em>Ruifeng Yuan, Shichao Sun, Zili Wang, Ziqiang Cao, Wenjie Li.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.14215"><strong>Commonsense-augmented Memory Construction and Management in Long-term Conversations via Context-aware Persona Refinement.</strong></a> <em>Hana Kim, Kai Tzu-iunn Ong, Seoyeon Kim, Dongha Lee, Jinyoung Yeo.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.09727v1"><strong>A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts.</strong></a> <em>Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta, John Canny, Ian Fischer.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.10453"><strong>Steering Conversational Large Language Models for Long Emotional Support Conversations.</strong></a> <em>Navid Madani, Sougata Saha, Rohini Srihari.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.10555"><strong>SPAR: Personalized Content-Based Recommendation via Long Engagement Attention.</strong></a> <em>Chiyu Zhang, Yifei Sun, Jun Chen, Jie Lei, Muhammad Abdul-Mageed, Sinong Wang, Rong Jin, Sem Park, Ning Yao, Bo Long.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.11975"><strong>Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations.</strong></a> <em>Nuo Chen, Hongguang Li, Juhua Huang, Baoyuan Wang, Jia Li.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/nuochenpku/COMEDY"><img src="https://img.shields.io/github/stars/nuochenpku/COMEDY" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2403.08312"><strong>StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses.</strong></a> <em>Jia-Nan Li, Quan Tu, Cunli Mao, Zhengtao Yu, Ji-Rong Wen, Rui Yan.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.02319"><strong>Prompts As Programs: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization.</strong></a> <em>Tobias Schnabel, Jennifer Neville.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/microsoft/sammo"><img src="https://img.shields.io/github/stars/microsoft/sammo" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.06067"><strong>HMT: Hierarchical Memory Transformer for Long Context Language Processing.</strong></a> <em>Tobias Schnabel, Jennifer Neville.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/OswaldHe/HMT-pytorch"><img src="https://img.shields.io/github/stars/OswaldHe/HMT-pytorch" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.12528"><strong>SirLLM: Streaming Infinite Retentive LLM.</strong></a> <em>Yao Yao, Zuchao Li, Hai Zhao.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Zoeyyao27/SirLLM"><img src="https://img.shields.io/github/stars/Zoeyyao27/SirLLM" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.00057"><strong>Toward Conversational Agents with Context and Time Sensitive Long-term Memory.</strong></a> <em>Nick Alonso, Tomás Figliolia, Anthony Ndirango, Beren Millidge.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Zyphra/TemporalMemoryDataset"><img src="https://img.shields.io/github/stars/Zyphra/TemporalMemoryDataset" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.02002"><strong>Position Debiasing Fine-Tuning for Causal Perception in Long-Term Dialogue.</strong></a> <em>Shixuan Fan, Wei Wei, Wendi Li, Xian-Ling Mao, Wenfeng Xie, Dangyang Chen.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.06124"><strong>Enhancing Long-Term Memory using Hierarchical Aggregate Tree for Retrieval Augmented Generation.</strong></a> <em>Aadharsh Aadhithya A, Sachin Kumar S, Soman K.P.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.19371"><strong>Suri: Multi-constraint Instruction Following for Long-form Text Generation.</strong></a> <em>Chau Minh Pham, Simeng Sun, Mohit Iyyer.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/chtmp223/suri"><img src="https://img.shields.io/github/stars/chtmp223/suri" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.09559"><strong>HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model.</strong></a> <em>Mengkang Hu, Tianxing Chen, Qiguang Chen, Yao Mu, Wenqi Shao, Ping Luo.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/HiAgent2024/HiAgent"><img src="https://img.shields.io/github/stars/HiAgent2024/HiAgent" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.01696"><strong>CreDes: Causal Reasoning Enhancement and Dual-End Searching for Solving Long-Range Reasoning Problems using LLMs.</strong></a> <em>Kangsheng Wang, Xiao Zhang, Hao Liu, Songde Han, Huimin Ma, Tianyu Hu.</em> Arxiv 2024.</li>
</ol>
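
<p>Several entries in this section (MemoryBank, the recursive-summarization line, COMEDY) rest on the same loop: after each turn, compress the old memory plus the new turn back into a bounded summary that is prepended to the next prompt. A toy sketch of that loop; the summarize function below is a hypothetical placeholder for an LLM call and simply truncates.</p>

<pre><code class="language-python">def summarize(text, limit=200):
    """Placeholder for an LLM summarization call; here it just truncates."""
    return text[:limit]

class RollingMemory:
    """Bounded long-term memory via recursive summarization."""
    def __init__(self, limit=200):
        self.summary, self.limit = "", limit

    def update(self, turn):
        # Compress (old summary + new turn) back under the budget.
        self.summary = summarize(self.summary + " " + turn, self.limit)

    def prompt(self, user_msg):
        return "Memory: " + self.summary + "\nUser: " + user_msg

mem = RollingMemory()
for turn in ["user likes hiking", "user lives in Seoul", "user asked about Mamba"]:
    mem.update(turn)
print(mem.prompt("what do you remember?"))
</code></pre>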

<h2 id="7-rag-and-icl">7. RAG and ICL</h2>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2310.05029"><strong>Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading.</strong></a> <em>Howard Chen, Ramakanth Pasunuru, Jason Weston, Asli Celikyilmaz.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.04881"><strong>Attendre: Wait To Attend By Retrieval With Evicted Queries in Memory-Based Transformers for Long Context Processing.</strong></a> <em>Zi Yang, Nan Hua.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.11573"><strong>BGE Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models.</strong></a> <em>Kun Luo, Zheng Liu, Shitao Xiao, Kang Liu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/FlagOpen/FlagEmbedding"><img src="https://img.shields.io/github/stars/FlagOpen/FlagEmbedding" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2403.14403"><strong>Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity.</strong></a> <em>Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/starsuzi/Adaptive-RAG"><img src="https://img.shields.io/github/stars/starsuzi/Adaptive-RAG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.00610"><strong>RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation.</strong></a> <em>Chi-Min Chan, Chunpu Xu, Ruibin Yuan, Hongyin Luo, Wei Xue, Yike Guo, Jie Fu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/chanchimin/RQ-RAG"><img src="https://img.shields.io/github/stars/chanchimin/RQ-RAG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2404.02022"><strong>Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts.</strong></a> <em>Zhuo Chen, Xinyu Wang, Yong Jiang, Pengjun Xie, Fei Huang, Kewei Tu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.06910"><strong>Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation.</strong></a> <em>Thomas Merth, Qichen Fu, Mohammad Rastegari, Mahyar Najibi.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.15103"><strong>Multi-view Content-aware Indexing for Long Document Retrieval.</strong></a> <em>Kuicai Dong, Derrick Goh Xin Deik, Yi Quan Lee, Hao Zhang, Xiangyang Li, Cong Zhang, Yong Liu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.15574"><strong>Retrieval Head Mechanistically Explains Long-Context Factuality.</strong></a> <em>Wenhao Wu, Yizhong Wang, Guangxuan Xiao, Hao Peng, Yao Fu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/nightdessert/Retrieval_Head"><img src="https://img.shields.io/github/stars/nightdessert/Retrieval_Head" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.04065"><strong>FlashBack:Efficient Retrieval-Augmented Language Modeling for Long Context Inference.</strong></a> <em>Runheng Liu, Xingchen Xiao, Heyan Huang, Zewen Chi, Zhijing Wu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.10738"><strong>Feature-Adaptive and Data-Scalable In-Context Learning.</strong></a> <em>Jiahao Li, Quan Wang, Licheng Zhang, Guoqing Jin, Zhendong Mao.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/jiahaozhenbang/FADS-ICL"><img src="https://img.shields.io/github/stars/jiahaozhenbang/FADS-ICL" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.12035"><strong>KG-RAG: Bridging the Gap Between Knowledge and Creativity.</strong></a> <em>Diego Sanmartin.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/dsanmart/KG-RAG"><img src="https://img.shields.io/github/stars/dsanmart/KG-RAG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.14831"><strong>HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models.</strong></a> <em>Bernal Jiménez Gutiérrez, Yiheng Shu, Yu Gu, Michihiro Yasunaga, Yu Su.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/OSU-NLP-Group/HippoRAG"><img src="https://img.shields.io/github/stars/OSU-NLP-Group/HippoRAG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.14660"><strong>Implicit In-context Learning.</strong></a> <em>Zhuowei Li, Zihao Xu, Ligong Han, Yunhe Gao, Song Wen, Di Liu, Hao Wang, Dimitris N. Metaxas.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/LzVv123456/I2CL"><img src="https://img.shields.io/github/stars/LzVv123456/I2CL" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.15318"><strong>Are Long-LLMs A Necessity For Long-Context Tasks?.</strong></a> <em>Hongjin Qian, Zheng Liu, Peitian Zhang, Kelong Mao, Yujia Zhou, Xu Chen, Zhicheng Dou.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.16178"><strong>Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection.</strong></a> <em>Yun Zhu, Jia-Chen Gu, Caitlin Sikora, Ho Ko, Yinxiao Liu, Chu-Cheng Lin, Lei Shu, Liangchen Luo, Lei Meng, Bang Liu, Jindong Chen.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.19874"><strong>Is In-Context Learning Sufficient for Instruction Following in LLMs?.</strong></a> <em>Hao Zhao, Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/tml-epfl/icl-alignment"><img src="https://img.shields.io/github/stars/tml-epfl/icl-alignment" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.03092"><strong>FragRel: Exploiting Fragment-level Relations in the External Memory of Large Language Models.</strong></a> <em>Xihang Yue, Linchao Zhu, Yi Yang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.05085"><strong>Multi-Head RAG: Solving Multi-Aspect Problems with LLMs.</strong></a> <em>Maciej Besta, Ales Kubicek, Roman Niggli, Robert Gerstenberger, Lucas Weitzendorf, Mingyuan Chi, Patrick Iff, Joanna Gajda, Piotr Nyczyk, Jürgen Müller, Hubert Niewiadomski, Marcin Chrapek, Michał Podstawski, Torsten Hoefler.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/spcl/MRAG"><img src="https://img.shields.io/github/stars/spcl/MRAG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.10878"><strong>Demonstration Notebook: Finding the Most Suited In-Context Learning Example from Interactions.</strong></a> <em>Yiming Tang, Bin Dong.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.12331"><strong>Retrieval Meets Reasoning: Dynamic In-Context Editing for Long-Text Understanding.</strong></a> <em>Weizhi Fei, Xueyan Niu, Guoqing Xie, Yanhua Zhang, Bo Bai, Lei Deng, Wei Han.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.13779"><strong>FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering.</strong></a> <em>Tianchi Cai, Zhiwen Tan, Xierui Song, Tao Sun, Jiyan Jiang, Yunqi Xu, Yinger Zhang, Jinjie Gu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://huggingface.co/forag"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.13632"><strong>Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations.</strong></a> <em>Arie Cattan, Alon Jacovi, Alex Fabrikant, Jonathan Herzig, Roee Aharoni, Hannah Rashkin, Dror Marcus, Avinatan Hassidim, Yossi Matias, Idan Szpektor, Avi Caciularu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.15319"><strong>LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs.</strong></a> <em>Ziyan Jiang, Xueguang Ma, Wenhu Chen.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/TIGER-AI-Lab/LongRAG"><img src="https://img.shields.io/github/stars/TIGER-AI-Lab/LongRAG" alt="GitHub Repo stars" /></a>
        <a href="https://tiger-ai-lab.github.io/LongRAG/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.15334"><strong>Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning.</strong></a> <em>Brandon Huang, Chancharik Mitra, Assaf Arbelle, Leonid Karlinsky, Trevor Darrell, Roei Herzig.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.19292"><strong>From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data.</strong></a> <em>Zheyang Xiong, Vasilis Papageorgiou, Kangwook Lee, Dimitris Papailiopoulos.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.01178"><strong>Memory3: Language Modeling with Explicit Memory.</strong></a> <em>Hongkang Yang, Zehao Lin, Wenjin Wang, Hao Wu, Zhiyu Li, Bo Tang, Wenqiang Wei, Jinbo Wang, Zeyun Tang, Shichao Song, Chenyang Xi, Yu Yu, Kai Chen, Feiyu Xiong, Linpeng Tang, Weinan E.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.08223"><strong>Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting.</strong></a> <em>Zilong Wang, Zifeng Wang, Long Le, Huaixiu Steven Zheng, Swaroop Mishra, Vincent Perot, Yuwei Zhang, Anush Mattapalli, Ankur Taly, Jingbo Shang, Chen-Yu Lee, Tomas Pfister.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.13101"><strong>Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach.</strong></a> <em>Zhouyu Jiang, Mengshu Sun, Lei Liang, Zhiqiang Zhang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.13249"><strong>R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation.</strong></a> <em>Fuda Ye, Shuangyin Li, Yongqi Zhang, Lei Chen.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/yefd/RRAG"><img src="https://img.shields.io/github/stars/yefd/RRAG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.03246"><strong>Making Long-Context Language Models Better Multi-Hop Reasoners.</strong></a> <em>Yanyang Li, Shuo Liang, Michael R. Lyu, Liwei Wang.</em> ACL 2024.</li>
</ol>

<p>        <a href="https://github.com/LaVi-Lab/LongContextReasoner"><img src="https://img.shields.io/github/stars/LaVi-Lab/LongContextReasoner" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.07505"><strong>Large Language Models Know What Makes Exemplary Contexts.</strong></a> <em>Quanyu Long, Jianda Chen, Wenya Wang, Sinno Jialin Pan.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/ruyue0001/RL-ICL"><img src="https://img.shields.io/github/stars/ruyue0001/RL-ICL" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.08067"><strong>RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation.</strong></a> <em>Dongyu Ru, Lin Qiu, Xiangkun Hu, Tianhang Zhang, Peng Shi, Shuaichen Chang, Cheng Jiayang, Cunxiang Wang, Shichao Sun, Huanyu Li, Zizhao Zhang, Binjie Wang, Jiarong Jiang, Tong He, Zhiguo Wang, Pengfei Liu, Yue Zhang, Zheng Zhang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/amazon-science/RAGChecker"><img src="https://img.shields.io/github/stars/amazon-science/RAGChecker" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.14906"><strong>Writing in the Margins: Better Inference Pattern for Long Context Retrieval.</strong></a> <em>Melisa Russak, Umar Jamil, Christopher Bryant, Kiran Kamble, Axel Magnuson, Mateusz Russak, Waseem AlShikh.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/writer/writing-in-the-margins"><img src="https://img.shields.io/github/stars/writer/writing-in-the-margins" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.16967"><strong>MemLong: Memory-Augmented Retrieval for Long Text Modeling.</strong></a> <em>Weijie Liu, Zecheng Tang, Juntao Li, Kehai Chen, Min Zhang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Bui1dMySea/MemLong"><img src="https://img.shields.io/github/stars/Bui1dMySea/MemLong" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.01666"><strong>In Defense of RAG in the Era of Long-Context Language Models.</strong></a> <em>Tan Yu, Anbang Xu, Rama Akkiraju.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.05591"><strong>MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery.</strong></a> <em>Hongjin Qian, Peitian Zhang, Zheng Liu, Kelong Mao, Zhicheng Dou.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.13695"><strong>You Only Use Reactive Attention Slice For Long Context Retrieval.</strong></a> <em>Yun Joon Soh, Hanxian Huang, Yuandong Tian, Jishen Zhao.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/yjsoh/youra"><img src="https://img.shields.io/github/stars/yjsoh/youra" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.13992"><strong>SMART-RAG: Selection using Determinantal Matrices for Augmented Retrieval.</strong></a> <em>Jiatao Li, Xinyu Hu, Xiaojun Wan.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.15699"><strong>Lighter And Better: Towards Flexible Context Adaptation For Retrieval Augmented Generation.</strong></a> <em>Zheng Liu, Chenyuan Wu, Ninglu Shao, Shitao Xiao, Chaozhuo Li, Defu Lian.</em> CIKM 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.01671"><strong>Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding.</strong></a> <em>Yanming Liu, Xinyue Peng, Jiannan Cao, Shi Bo, Yanxin Shen, Xuhong Zhang, Sheng Cheng, Xun Wang, Jianwei Yin, Tianyu Du.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.03227"><strong>ALR2: A Retrieve-then-Reason Framework for Long-context Question Answering.</strong></a> <em>Huayang Li, Pat Verga, Priyanka Sen, Bowen Yang, Vijay Viswanathan, Patrick Lewis, Taro Watanabe, Yixuan Su.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.04343"><strong>Inference Scaling for Long-Context Retrieval Augmented Generation.</strong></a> <em>Zhenrui Yue, Honglei Zhuang, Aijun Bai, Kai Hui, Rolf Jagerman, Hansi Zeng, Zhen Qin, Dong Wang, Xuanhui Wang, Michael Bendersky.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.04790"><strong>GARLIC: LLM-Guided Dynamic Progress Control with Hierarchical Weighted Graph for Long Document QA.</strong></a> <em>Xinyu Wang, Yanzheng Xiang, Lin Gui, Yulan He.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.05983"><strong>Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG.</strong></a> <em>Bowen Jin, Jinsung Yoon, Jiawei Han, Sercan O. Arik.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.07176"><strong>Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models.</strong></a> <em>Fei Wang, Xingchen Wan, Ruoxi Sun, Jiefeng Chen, Sercan Ö. Arık.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.06519"><strong>SEGMENT+: Long Text Processing with Short-Context Language Models.</strong></a> <em>Wei Shi, Shuang Li, Kerun Yu, Jinglei Chen, Zujie Liang, Xinhui Wu, Yuxi Qian, Feng Wei, Bo Zheng, Jiaqing Liang, Jiangjie Chen, Yanghua Xiao.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/WeiShi-9/segmentplus"><img src="https://img.shields.io/github/stars/WeiShi-9/segmentplus" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.11001"><strong>Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs.</strong></a> <em>Haozhen Zhang, Tao Feng, Jiaxuan You.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/ulab-uiuc/GoR"><img src="https://img.shields.io/github/stars/ulab-uiuc/GoR" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.11119"><strong>ChuLo: Chunk-Level Key Information Representation for Long Document Processing.</strong></a> <em>Yan Li, Caren Han, Yue Dai, Feiqi Cao.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.07590"><strong>TurboRAG: Accelerating Retrieval-Augmented Generation with Precomputed KV Caches for Chunked Text.</strong></a> <em>Songshuo Lu, Hua Wang, Yutian Rong, Zhi Chen, Yaohua Tang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.09342"><strong>LLM×MapReduce: Simplified Long-Sequence Processing using Large Language Models.</strong></a> <em>Zihan Zhou, Chong Li, Xinyi Chen, Shuo Wang, Yu Chao, Zhili Li, Haoyu Wang, Rongqiao An, Qi Shi, Zhixing Tan, Xu Han, Xiaodong Shi, Zhiyuan Liu, Maosong Sun.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/thunlp/LLMxMapReduce"><img src="https://img.shields.io/github/stars/thunlp/LLMxMapReduce" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.12859"><strong>Enhancing Long Context Performance in LLMs Through Inner Loop Query Mechanism.</strong></a> <em>Yimin Tang, Yurong Xu, Ning Yan, Masood Mortazavi.</em> NeurIPS 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.18050"><strong>LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering.</strong></a> <em>Qingfei Zhao, Ruobing Wang, Yukuo Cen, Daren Zha, Shicheng Tan, Yuxiao Dong, Jie Tang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/QingFei1/LongRAG"><img src="https://img.shields.io/github/stars/QingFei1/LongRAG" alt="GitHub Repo stars" /></a></p>

<h2 id="8-agent">8. Agent</h2>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.11550"><strong>LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration.</strong></a> <em>Jun Zhao, Can Zu, Hao Xu, Yi Lu, Wei He, Yiwen Ding, Tao Gui, Qi Zhang, Xuanjing Huang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://openreview.net/forum?id=9JQtrumvg8"><strong>A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis.</strong></a> <em>Izzeddin Gur, Hiroki Furuta, Austin V Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust.</em> ICLR 2024 Oral.</p>
  </li>
  <li>
    <p><a href="https://aclanthology.org/2024.eacl-long.29/"><strong>PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents.</strong></a> <em>Simeng Sun, Yang Liu, Shuohang Wang, Dan Iter, Chenguang Zhu, Mohit Iyyer.</em> EACL 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/SimengSun/pearl"><img src="https://img.shields.io/github/stars/SimengSun/pearl" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://openreview.net/forum?id=M6XWoEdmwf"><strong>AMAGO: Scalable In-Context Reinforcement Learning for Adaptive Agents.</strong></a> <em>Jake Grigsby, Linxi Fan, Yuke Zhu.</em> ICLR 2024 Spotlight.</li>
</ol>

<p>        <a href="https://github.com/UT-Austin-RPL/amago"><img src="https://img.shields.io/github/stars/UT-Austin-RPL/amago" alt="GitHub Repo stars" /></a>
        <a href="https://ut-austin-rpl.github.io/amago/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.02818"><strong>Chain of Agents: Large Language Models Collaborating on Long-Context Tasks.</strong></a> <em>Yusen Zhang, Ruoxi Sun, Yanfei Chen, Tomas Pfister, Rui Zhang, Sercan Ö. Arik.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.14550"><strong>GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models.</strong></a> <em>Shilong Li, Yancheng He, Hangyu Guo, Xingyuan Bu, Ge Bai, Jie Liu, Jiaheng Liu, Xingwei Qu, Yangguang Li, Wanli Ouyang, Wenbo Su, Bo Zheng.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.09893"><strong>Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks.</strong></a> <em>Shengbin Yue, Siyuan Wang, Wei Chen, Xuanjing Huang, Zhongyu Wei.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.03615"><strong>Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks.</strong></a> <em>Zaijing Li, Yuquan Xie, Rui Shao, Gongwei Chen, Dongmei Jiang, Liqiang Nie.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/JiuTian-VL/Optimus-1"><img src="https://img.shields.io/github/stars/JiuTian-VL/Optimus-1" alt="GitHub Repo stars" /></a>
        <a href="https://cybertronagent.github.io/Optimus-1.github.io/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<h2 id="9-compress">9. Compress</h2>

<ol>
  <li><a href="https://arxiv.org/abs/2305.14788"><strong>Adapting Language Models to Compress Contexts.</strong></a> <em>Alexis Chevalier, Alexander Wettig, Anirudh Ajith, Danqi Chen.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/princeton-nlp/AutoCompressors"><img src="https://img.shields.io/github/stars/princeton-nlp/AutoCompressors" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2310.06201"><strong>Compressing Context to Enhance Inference Efficiency of Large Language Models.</strong></a> <em>Yucheng Li, Bo Dong, Chenghua Lin, Frank Guerin.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/liyucheng09/Selective_Context"><img src="https://img.shields.io/github/stars/liyucheng09/Selective_Context" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2310.05736"><strong>LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models.</strong></a> <em>Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang, Lili Qiu.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/microsoft/LLMLingua"><img src="https://img.shields.io/github/stars/microsoft/LLMLingua" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2310.06839"><strong>LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression.</strong></a> <em>Huiqiang Jiang, Qianhui Wu, Xufang Luo, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/microsoft/LLMLingua"><img src="https://img.shields.io/github/stars/microsoft/LLMLingua" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2311.11829"><strong>System 2 Attention (is something you might need too).</strong></a> <em>Jason Weston, Sainbayar Sukhbaatar.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2312.13211"><strong>DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization.</strong></a> <em>Rahul Chand, Yashoteja Prabhu, Pratyush Kumar.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.03462"><strong>Soaring from 4K to 400K: Extending LLM’s Context with Activation Beacon.</strong></a> <em>Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/FlagOpen/FlagEmbedding"><img src="https://img.shields.io/github/stars/FlagOpen/FlagEmbedding" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2401.07793"><strong>Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization.</strong></a> <em>Ninglu Shao, Shitao Xiao, Zheng Liu, Peitian Zhang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/FlagOpen/FlagEmbedding"><img src="https://img.shields.io/github/stars/FlagOpen/FlagEmbedding" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2402.16058"><strong>Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression.</strong></a> <em>Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/OpenMatch/Gist-COCO"><img src="https://img.shields.io/github/stars/OpenMatch/Gist-COCO" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.18700"><strong>Learning to Compress Prompt in Natural Language Formats.</strong></a> <em>Yu-Neng Chuang, Tianwei Xing, Chia-Yuan Chang, Zirui Liu, Xun Chen, Xia Hu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2403.09636"><strong>Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference.</strong></a> <em>Piotr Nawrot, Adrian Łańcucki, Marcin Chochowski, David Tarjan, Edoardo M. Ponti.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2403.12968"><strong>LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression.</strong></a> <em>Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Rühle, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Dongmei Zhang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/microsoft/LLMLingua"><img src="https://img.shields.io/github/stars/microsoft/LLMLingua" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2403.17411"><strong>PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models.</strong></a> <em>Jinyi Li, Yihuai Lan, Lei Wang, Hao Wang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/3DAgentWorld/Toolkit-for-Prompt-Compression"><img src="https://img.shields.io/github/stars/3DAgentWorld/Toolkit-for-Prompt-Compression" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2312.03414"><strong>Compressed Context Memory for Online Language Model Interaction.</strong></a> <em>Jang-Hyun Kim, Junyoung Yeom, Sangdoo Yun, Hyun Oh Song.</em> ICLR 2024.</li>
</ol>

<p>        <a href="https://github.com/snu-mllab/context-memory"><img src="https://img.shields.io/github/stars/snu-mllab/context-memory" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2403.19135"><strong>Compressing Large Language Models by Streamlining the Unimportant Layer.</strong></a> <em>Xiaodong Chen, Yuxuan Hu, Jing Zhang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.00489"><strong>PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression.</strong></a> <em>Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.03626"><strong>Training LLMs over Neurally Compressed Text.</strong></a> <em>Brian Lester, Jaehoon Lee, Alex Alemi, Jeffrey Pennington, Adam Roberts, Jascha Sohl-Dickstein, Noah Constant.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.02657"><strong>Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models.</strong></a> <em>Taiqiang Wu, Chaofan Tao, Jiahao Wang, Zhe Zhao, Ngai Wong.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.04997"><strong>Adapting LLMs for Efficient Context Processing through Soft Prompt Compression.</strong></a> <em>Cangqing Wang, Yutian Yang, Ruisi Li, Dan Sun, Ruicong Cai, Yuzhu Zhang, Chengqian Fu, Lillian Floyd.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://openreview.net/forum?id=uNrFpDPMyo"><strong>Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs.</strong></a> <em>Suyu Ge, Yunan Zhang, Liyuan Liu, Minjia Zhang, Jiawei Han, Jianfeng Gao.</em> ICLR 2024 Oral.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.07979"><strong>LLoCO: Learning Long Contexts Offline.</strong></a> <em>Sijun Tan, Xiuyu Li, Shishir Patil, Ziyang Wu, Tianjun Zhang, Kurt Keutzer, Joseph E. Gonzalez, Raluca Ada Popa.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/jeffreysijuntan/lloco"><img src="https://img.shields.io/github/stars/jeffreysijuntan/lloco" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.11225"><strong>In-Context Learning State Vector with Inner and Momentum Optimization.</strong></a> <em>Dongfang Li, Zhenyu Liu, Xinshuo Hu, Zetian Sun, Baotian Hu, Min Zhang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/HITsz-TMG/ICL-State-Vector"><img src="https://img.shields.io/github/stars/HITsz-TMG/ICL-State-Vector" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.03085"><strong>Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation.</strong></a> <em>Kaize Shi, Xueyao Sun, Qing Li, Guandong Xu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.04955"><strong>Improving Long Text Understanding with Knowledge Distilled from Summarization Model.</strong></a> <em>Yan Liu, Yazheng Yang, Xiaokang Chen.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.05957"><strong>OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning.</strong></a> <em>Dan Qiao, Yi Su, Pinzheng Wang, Jing Ye, Wenjing Xie, Yuechi Zhou, Yuyang Ding, Zecheng Tang, Jikai Wang, Yixin Ji, Yue Wang, Pei Guo, Zechen Sun, Zikang Zhang, Juntao Li, Pingfu Chao, Wenliang Chen, Guohong Fu, Guodong Zhou, Qiaoming Zhu, Min Zhang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/OpenNLG/OpenBA-v2"><img src="https://img.shields.io/github/stars/OpenNLG/OpenBA-v2" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.10616"><strong>Feature-based Low-Rank Compression of Large Language Models via Bayesian Optimization.</strong></a> <em>Yixin Ji, Yang Xiang, Juntao Li, Wei Chen, Zhongyi Liu, Kehai Chen, Min Zhang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Dereck0602/Bolaco"><img src="https://img.shields.io/github/stars/Dereck0602/Bolaco" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2403.15268"><strong>Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models.</strong></a> <em>Huanxuan Liao, Shizhu He, Yao Xu, Yuanzhe Zhang, Kang Liu, Shengping Liu, Jun Zhao.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Xnhyacinth/IAG"><img src="https://img.shields.io/github/stars/Xnhyacinth/IAG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.12250"><strong>Your Transformer is Secretly Linear.</strong></a> <em>Anton Razzhigaev, Matvey Mikhalchuk, Elizaveta Goncharova, Nikolai Gerasimenko, Ivan Oseledets, Denis Dimitrov, Andrey Kuznetsov.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/AIRI-Institute/LLM-Microscope"><img src="https://img.shields.io/github/stars/AIRI-Institute/LLM-Microscope" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.13792"><strong>xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token.</strong></a> <em>Xin Cheng, Xun Wang, Xingxing Zhang, Tao Ge, Si-Qing Chen, Furu Wei, Huishuai Zhang, Dongyan Zhao.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Hannibal046/xRAG"><img src="https://img.shields.io/github/stars/Hannibal046/xRAG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.17052"><strong>SelfCP: Compressing Long Prompt to 1/12 Using the Frozen Large Language Model Itself.</strong></a> <em>Jun Gao.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.16635"><strong>Compressing Lengthy Context With UltraGist.</strong></a> <em>Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/namespace-Pt/UltraGist"><img src="https://img.shields.io/github/stars/namespace-Pt/UltraGist" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.17755"><strong>XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference.</strong></a> <em>Shengnan Wang, Youhui Bai, Lin Zhang, Pingyi Zhou, Shixiong Zhao, Gong Zhang, Sen Wang, Renhai Chen, Hua Xu, Hongwei Sun.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://openreview.net/forum?id=uREj4ZuGJE"><strong>In-context Autoencoder for Context Compression in a Large Language Model.</strong></a> <em>Tao Ge, Hu Jing, Lei Wang, Xun Wang, Si-Qing Chen, Furu Wei.</em> ICLR 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/getao/icae"><img src="https://img.shields.io/github/stars/getao/icae" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.02376"><strong>Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs.</strong></a> <em>Zhiwei Cao, Qian Cao, Yu Lu, Ningxin Peng, Luyang Huang, Shanbo Cheng, Jinsong Su.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/DeepLearnXMU/QGC"><img src="https://img.shields.io/github/stars/DeepLearnXMU/QGC" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.06110"><strong>Recurrent Context Compression: Efficiently Expanding the Context Window of LLM.</strong></a> <em>Chensen Huang, Guibo Zhu, Xuepeng Wang, Yifei Luo, Guojing Ge, Haoran Chen, Dong Yi, Jinqiao Wang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/WUHU-G/RCC_Transformer"><img src="https://img.shields.io/github/stars/WUHU-G/RCC_Transformer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.05317"><strong>LoCoCo: Dropping In Convolutions for Long Context Compression.</strong></a> <em>Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/VITA-Group/LoCoCo"><img src="https://img.shields.io/github/stars/VITA-Group/LoCoCo" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.06773"><strong>Evaluating Zero-Shot Long-Context LLM Compression.</strong></a> <em>Chenyu Wang, Yihan Wang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.11097"><strong>InstructCMP: Length Control in Sentence Compression through Instruction-based Large Language Models.</strong></a> <em>Juseon-Do, Jingun Kwon, Hidetaka Kamigaito, Manabu Okumura.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/JuseonDo/InstructCMP"><img src="https://img.shields.io/github/stars/JuseonDo/InstructCMP" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2306.00978"><strong>AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration.</strong></a> <em>Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, Song Han.</em> MLSys 2024 Best Paper Award.</li>
</ol>

<p>        <a href="https://github.com/JuseonDo/InstructCMP"><img src="https://img.shields.io/github/stars/JuseonDo/InstructCMP" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.13618"><strong>In-Context Former: Lightning-fast Compressing Context for Large Language Model.</strong></a> <em>Xiangfeng Wang, Zaiyi Chen, Zheyong Xie, Tong Xu, Yongyi He, Enhong Chen.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.18173"><strong>UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs.</strong></a> <em>Wenhao Li, Mingbao Lin, Yunshan Zhong, Shuicheng Yan, Rongrong Ji.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/wenhaoli-xmu/UIO-LLMs"><img src="https://img.shields.io/github/stars/wenhaoli-xmu/UIO-LLMs" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.02211"><strong>PromptIntern: Saving Inference Costs by Internalizing Recurrent Prompt during Large Language Model Fine-tuning.</strong></a> <em>Jiaru Zou, Mengyu Zhou, Tao Li, Shi Han, Dongmei Zhang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.02043"><strong>Concise and Precise Context Compression for Tool-Using Language Models.</strong></a> <em>Yang Xu, Yunlong Feng, Honglin Mu, Yutai Hou, Yitong Li, Xinghao Wang, Wanjun Zhong, Zhongyang Li, Dandan Tu, Qingfu Zhu, Min Zhang, Wanxiang Che.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.09252"><strong>Context Embeddings for Efficient Answer Generation in RAG.</strong></a> <em>David Rau, Shuai Wang, Hervé Déjean, Stéphane Clinchant.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.08892"><strong>Characterizing Prompt Compression Methods for Long Context Inference.</strong></a> <em>Siddharth Jha, Lutfi Eren Erdogan, Sehoon Kim, Kurt Keutzer, Amir Gholami.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.15504"><strong>Fundamental Limits of Prompt Compression: A Rate-Distortion Framework for Black-Box Language Models.</strong></a> <em>Adway Girish, Alliot Nagle, Marco Bondaschi, Michael Gastpar, Ashok Vardhan Makkuva, Hyeji Kim.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.00274"><strong>QUITO: Accelerating Long-Context Reasoning through Query-Guided Context Compression.</strong></a> <em>Wenshan Wang, Yihang Wang, Yixing Fan, Huaming Liao, Jiafeng Guo.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Wenshansilvia/attention_compressor"><img src="https://img.shields.io/github/stars/Wenshansilvia/attention_compressor" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2408.00655"><strong>SentenceVAE: Faster, Longer and More Accurate Inference with Next-sentence Prediction for Large Language Models.</strong></a> <em>Hongjun An, Yifan Chen, Xiaozhen Qiao, Zhe Sun, Xuelong Li.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.10497"><strong>QUITO-X: An Information Bottleneck-based Compression Algorithm with Cross-Attention.</strong></a> <em>Yihang Wang, Xu Huang, Bowen Tian, Yixing Fan, Jiafeng Guo.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.01579"><strong>AdaComp: Extractive Context Compression with Adaptive Predictor for Retrieval-Augmented Large Language Models.</strong></a> <em>Qianchi Zhang, Hainan Zhang, Liang Pang, Hongwei Zheng, Zhiming Zheng.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.01227"><strong>Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference.</strong></a> <em>Barys Liskavets, Maxim Ushakov, Shuvendu Roy, Mark Klibanov, Ali Etemad, Shane Luke.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Workday/cpc"><img src="https://img.shields.io/github/stars/Workday/cpc" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.12468"><strong>Familiarity-aware Evidence Compression for Retrieval Augmented Generation.</strong></a> <em>Dongwon Jung, Qin Liu, Tenghao Huang, Ben Zhou, Muhao Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/luka-group/FaviComp"><img src="https://img.shields.io/github/stars/luka-group/FaviComp" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.13035"><strong>TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning.</strong></a> <em>Shivam Shandilya, Menglin Xia, Supriyo Ghosh, Huiqiang Jiang, Jue Zhang, Qianhui Wu, Victor Rühle.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.15395"><strong>Parse Trees Guided LLM Prompt Compression.</strong></a> <em>Wenhao Mao, Chengbin Hou, Tianyu Zhang, Xinyu Lin, Ke Tang, Hairong Lv.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.17141"><strong>FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression.</strong></a> <em>Fazal Mittu, Yihuan Bu, Akshat Gupta, Ashok Devireddy, Alp Eren Ozdarendeli, Anant Singh, Gopala Anumanchipalli.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/fazalmittu/FineZip"><img src="https://img.shields.io/github/stars/fazalmittu/FineZip" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.19272"><strong>Perception Compressor:A training-free prompt compression method in long context scenarios.</strong></a> <em>Jiwei Tang, Jin Xu, Tingwei Lu, Hai Lin, Yiming Zhao, Hai-Tao Zheng.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.04139"><strong>From Reading to Compressing: Exploring the Multi-document Reader for Prompt Compression.</strong></a> <em>Eunseong Choi, Sunkyung Lee, Minjin Choi, June Park, Jongwuk Lee.</em> EMNLP 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.11786"><strong>Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability.</strong></a> <em>Tsz Ting Chung, Leyang Cui, Lemao Liu, Xinting Huang, Shuming Shi, Dit-Yan Yeung.</em> EMNLP 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.14042"><strong>Style-Compress: An LLM-Based Prompt Compression Framework Considering Task-Specific Styles.</strong></a> <em>Xiao Pu, Tianxing He, Xiaojun Wan.</em> EMNLP 2024.</p>
  </li>
</ol>

<h2 id="10-long-video-and-image">10. Long Video and Image</h2>

<ol>
  <li><a href="https://arxiv.org/abs/2405.18991"><strong>EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture.</strong></a> <em>Jiaqi Xu, Xinyi Zou, Kunzhe Huang, Yunkuo Chen, Bo Liu, MengLi Cheng, Xing Shi, Jun Huang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/aigc-apps/EasyAnimate"><img src="https://img.shields.io/github/stars/aigc-apps/EasyAnimate" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.19209"><strong>VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos.</strong></a> <em>Ziyang Wang, Shoubin Yu, Elias Stengel-Eskin, Jaehong Yoon, Feng Cheng, Gedas Bertasius, Mohit Bansal.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.20213"><strong>PostDoc: Generating Poster from a Long Multimodal Document Using Deep Submodular Optimization.</strong></a> <em>Vijay Jaisankar, Sambaran Bandyopadhyay, Kalp Vyas, Varre Chaitanya, Shwetha Somasundaram.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.10923"><strong>Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies.</strong></a> <em>Hung-Ting Su, Chun-Tong Chao, Ya-Ching Hsu, Xudong Lin, Yulei Niu, Hung-Yi Lee, Winston H. Hsu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/ander1119/TiM"><img src="https://img.shields.io/github/stars/ander1119/TiM" alt="GitHub Repo stars" /></a>
        <a href="https://ander1119.github.io/TiM/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.14129"><strong>Towards Event-oriented Long Video Understanding.</strong></a> <em>Yifan Du, Kun Zhou, Yuqi Huo, Yifan Li, Wayne Xin Zhao, Haoyu Lu, Zijia Zhao, Bingning Wang, Weipeng Chen, Ji-Rong Wen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/RUCAIBox/Event-Bench"><img src="https://img.shields.io/github/stars/RUCAIBox/Event-Bench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.02005"><strong>An End-to-End Speech Summarization Using Large Language Model.</strong></a> <em>Hengchao Shang, Zongyao Li, Jiaxin Guo, Shaojun Li, Zhiqiang Rao, Yuanchang Luo, Daimeng Wei, Hao Yang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.03104"><strong>KeyVideoLLM: Towards Large-scale Video Keyframe Selection.</strong></a> <em>Hao Liang, Jiapeng Li, Tianyi Bai, Chong Chen, Conghui He, Bin Cui, Wentao Zhang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.04923"><strong>OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding.</strong></a> <em>Tiancheng Zhao, Qianqian Zhang, Kyusong Lee, Peng Liu, Lu Zhang, Chunxin Fang, Jiajia Liao, Kelei Jiang, Yibo Ma, Ruochen Xu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.09541"><strong>MATE: Meet At The Embedding – Connecting Images with Long Texts.</strong></a> <em>Young Kyun Jang, Junmo Kang, Yong Jae Lee, Donghyun Kim.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.04840"><strong>mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models.</strong></a> <em>Jiabo Ye, Haiyang Xu, Haowei Liu, Anwen Hu, Ming Yan, Qi Qian, Ji Zhang, Fei Huang, Jingren Zhou.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/X-PLUG/mPLUG-Owl"><img src="https://img.shields.io/github/stars/X-PLUG/mPLUG-Owl" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.10188"><strong>LongVILA: Scaling Long-Context Visual Language Models for Long Videos.</strong></a> <em>Fuzhao Xue, Yukang Chen, Dacheng Li, Qinghao Hu, Ligeng Zhu, Xiuyu Li, Yunhao Fang, Haotian Tang, Shang Yang, Zhijian Liu, Ethan He, Hongxu Yin, Pavlo Molchanov, Jan Kautz, Linxi Fan, Yuke Zhu, Yao Lu, Song Han.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/NVlabs/VILA/blob/main/LongVILA.md"><img src="https://img.shields.io/github/stars/NVlabs/VILA" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2408.11788"><strong>DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework.</strong></a> <em>Zhifei Xie, Daniel Tang, Dingwei Tan, Jacques Klein, Tegawend F. Bissyand, Saad Ezzini.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.17443"><strong>Bridging Episodes and Semantics: A Novel Framework for Long-Form Video Understanding.</strong></a> <em>Gueter Josmy Faure, Jia-Fong Yeh, Min-Hung Chen, Hung-Ting Su, Winston H. Hsu, Shang-Hong Lai.</em> ECCV 2024 Workshop.</p>
  </li>
</ol>

<p>        <a href="https://github.com/joslefaure/HERMES"><img src="https://img.shields.io/github/stars/joslefaure/HERMES" alt="GitHub Repo stars" /></a>
        <a href="https://joslefaure.github.io/assets/html/hermes.html"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.01071"><strong>VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges.</strong></a> <em>Yuxuan Wang, Cihang Xie, Yang Liu, Zilong Zheng.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/bigai-nlco/VideoLLaMB"><img src="https://img.shields.io/github/stars/bigai-nlco/VideoLLaMB" alt="GitHub Repo stars" /></a>
        <a href="https://videollamb.github.io/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.05601"><strong>Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation.</strong></a> <em>Nithin Rao Koluguri, Travis Bartley, Hainan Xu, Oleksii Hrinchuk, Jagadeesh Balam, Boris Ginsburg, Georg Kucsko.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.02889"><strong>LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture.</strong></a> <em>Xidong Wang, Dingjie Song, Shunian Chen, Chen Zhang, Benyou Wang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/FreedomIntelligence/LongLLaVA"><img src="https://img.shields.io/github/stars/FreedomIntelligence/LongLLaVA" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.00741"><strong>VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models.</strong></a> <em>Jiapeng Wang, Chengyu Wang, Kunzhe Huang, Jun Huang, Lianwen Jin.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.19732"><strong>Rethinking Visual Dependency in Long-Context Reasoning for Large Vision-Language Models.</strong></a> <em>Yucheng Zhou, Zhi Rao, Jun Wan, Jianbing Shen.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.23277"><strong>SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation.</strong></a> <em>Yining Hong, Beide Liu, Maxine Wu, Yuanhao Zhai, Kai-Wei Chang, Lingjie Li, Kevin Lin, Chung-Ching Lin, Jianfeng Wang, Zhengyuan Yang, Yingnian Wu, Lijuan Wang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/slowfast-vgen/slowfast-vgen"><img src="https://img.shields.io/github/stars/slowfast-vgen/slowfast-vgen" alt="GitHub Repo stars" /></a>
        <a href="https://slowfast-vgen.github.io/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<h2 id="11-benchmark-and-evaluation">11. Benchmark and Evaluation</h2>

<h3 id="111-llm">11.1 LLM</h3>

<ol>
  <li><a href="https://arxiv.org/abs/2011.04006"><strong>Long Range Arena : A Benchmark for Efficient Transformers.</strong></a> <em>Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler.</em> ICLR 2021.</li>
</ol>

<p>        <a href="https://github.com/google-research/long-range-arena"><img src="https://img.shields.io/github/stars/google-research/long-range-arena" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://aclanthology.org/2022.tacl-1.25.pdf"><strong>LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation.</strong></a> <em>Jian Guan, Zhuoer Feng, Yamei Chen, Ruilin He, Xiaoxi Mao, Changjie Fan, Minlie Huang.</em> TACL 2022.</li>
</ol>

<p>        <a href="https://github.com/thu-coai/LOT-LongLM"><img src="https://img.shields.io/github/stars/thu-coai/LOT-LongLM" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2201.03533"><strong>SCROLLS: Standardized CompaRison Over Long Language Sequences.</strong></a> <em>Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, Adi Haviv, Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy.</em> EMNLP 2022.</li>
</ol>

<p>        <a href="https://github.com/tau-nlp/scrolls"><img src="https://img.shields.io/github/stars/tau-nlp/scrolls" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://aclanthology.org/2022.lrec-1.392/"><strong>MuLD: The Multitask Long Document Benchmark.</strong></a> <em>George Hudson, Noura Al Moubayed.</em> LREC 2022.</li>
</ol>

<p>        <a href="https://github.com/ghomasHudson/muld"><img src="https://img.shields.io/github/stars/ghomasHudson/muld" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2307.03172"><strong>Lost in the Middle: How Language Models Use Long Contexts.</strong></a> <em>Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/nelson-liu/lost-in-the-middle"><img src="https://img.shields.io/github/stars/nelson-liu/lost-in-the-middle" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2307.11088"><strong>L-Eval: Instituting Standardized Evaluation for Long Context Language Models.</strong></a> <em>Chenxin An, Shansan Gong, Ming Zhong, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/OpenLMLab/LEval"><img src="https://img.shields.io/github/stars/OpenLMLab/LEval" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2308.14508"><strong>LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding.</strong></a> <em>Yushi Bai, Xin Lv, Jiajie Zhang, Hongchang Lyu, Jiankai Tang, Zhidian Huang, Zhengxiao Du, Xiao Liu, Aohan Zeng, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li.</em> Arxiv 2023.</li>
</ol>

<p>        <a href="https://github.com/THUDM/LongBench"><img src="https://img.shields.io/github/stars/THUDM/LongBench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2309.06009"><strong>Content Reduction, Surprisal and Information Density Estimation for Long Documents.</strong></a> <em>Shaoxiong Ji, Wei Sun, Pekka Marttinen.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2309.13345"><strong>BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models.</strong></a> <em>Zican Dong, Tianyi Tang, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen.</em> Arxiv 2023.</p>
  </li>
</ol>

<p>        <a href="https://github.com/RUCAIBox/BAMBOO"><img src="https://img.shields.io/github/stars/RUCAIBox/BAMBOO" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2309.13345"><strong>Retrieval meets Long Context Large Language Models.</strong></a> <em>Peng Xu, Wei Ping, Xianchao Wu, Lawrence McAfee, Chen Zhu, Zihan Liu, Sandeep Subramanian, Evelina Bakhturina, Mohammad Shoeybi, Bryan Catanzaro.</em> Arxiv 2023.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/pdf/2311.04939v1.pdf"><strong>LooGLE: Long Context Evaluation for Long-Context Language Models.</strong></a> <em>Jiaqi Li, Mengmeng Wang, Zilong Zheng, Muhan Zhang.</em> Arxiv 2023.</p>
  </li>
</ol>

<p>        <a href="https://github.com/bigai-nlco/loogle"><img src="https://img.shields.io/github/stars/bigai-nlco/loogle" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2401.04925v1"><strong>The Impact of Reasoning Step Length on Large Language Models.</strong></a> <em>Mingyu Jin, Qinkai Yu, Dong shu, Haiyan Zhao, Wenyue Hua, Yanda Meng, Yongfeng Zhang, Mengnan Du.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.06915"><strong>DocFinQA: A Long-Context Financial Reasoning Dataset.</strong></a> <em>Varshini Reddy, Rik Koncel-Kedziorski, Viet Dac Lai, Chris Tanner.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.15050"><strong>LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents.</strong></a> <em>Ahmed Masry, Amir Hajian.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.15042"><strong>PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models.</strong></a> <em>Haochen Tan, Zhijiang Guo, Zhan Shi, Lu Xu, Zhili Liu, Xiaoguang Li, Yasheng Wang, Lifeng Shang, Qun Liu, Linqi Song.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2401.14490"><strong>LongHealth: A Question Answering Benchmark with Long Clinical Documents.</strong></a> <em>Lisa Adams, Felix Busch, Tianyu Han, Jean-Baptiste Excoffier, Matthieu Ortala, Alexander Löser, Hugo JWL. Aerts, Jakob Nikolas Kather, Daniel Truhn, Keno Bressem.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.09394"><strong>Long-form evaluation of model editing.</strong></a> <em>Domenic Rosati, Robie Gonzales, Jinkun Chen, Xuemin Yu, Melis Erkan, Yahya Kayani, Satya Deepika Chavatapalli, Frank Rudzicz, Hassan Sajjad.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/booydar/babilong"><img src="https://img.shields.io/github/stars/booydar/babilong" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2402.13718"><strong>∞Bench: Extending Long Context Evaluation Beyond 100K Tokens.</strong></a> <em>Xinrong Zhang, Yingfa Chen, Shengding Hu, Zihang Xu, Junhao Chen, Moo Khai Hao, Xu Han, Zhen Leng Thai, Shuo Wang, Zhiyuan Liu, Maosong Sun.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2402.14848"><strong>Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models.</strong></a> <em>Mosh Levy, Alon Jacoby, Yoav Goldberg.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/alonj/Same-Task-More-Tokens"><img src="https://img.shields.io/github/stars/alonj/Same-Task-More-Tokens" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2402.17753"><strong>Evaluating Very Long-Term Conversational Memory of LLM Agents.</strong></a> <em>Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, Yuwei Fang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/snap-research/LoCoMo"><img src="https://img.shields.io/github/stars/snap-research/LoCoMo" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2402.11111"><strong>Language Models as Science Tutors.</strong></a> <em>Alexis Chevalier, Jiayi Geng, Alexander Wettig, Howard Chen, Sebastian Mizera, Toni Annala, Max Jameson Aragon, Arturo Rodríguez Fanlo, Simon Frieder, Simon Machado, Akshara Prabhakar, Ellie Thieu, Jiachen T. Wang, Zirui Wang, Xindi Wu, Mengzhou Xia, Wenhan Jia, Jiatong Yu, Jun-Jie Zhu, Zhiyong Jason Ren, Sanjeev Arora, Danqi Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/princeton-nlp/LM-Science-Tutor"><img src="https://img.shields.io/github/stars/princeton-nlp/LM-Science-Tutor" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://github.com/gkamradt/LLMTest_NeedleInAHaystack"><strong>Needle in a haystack - pressure testing llms.</strong></a> <em>Kamradt, G.</em> Github 2024.</li>
</ol>

<p>        <a href="https://github.com/gkamradt/LLMTest_NeedleInAHaystack"><img src="https://img.shields.io/github/stars/gkamradt/LLMTest_NeedleInAHaystack" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2402.05136"><strong>LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K.</strong></a> <em>Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, Yu Wang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/infinigence/LVEval"><img src="https://img.shields.io/github/stars/infinigence/LVEval" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2403.11802"><strong>Counting-Stars: A Simple, Efficient, and Reasonable Strategy for Evaluating Long-Context Large Language Models.</strong></a> <em>Mingyang Song, Mao Zheng, Xuan Luo.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/nick7nlp/Counting-Stars"><img src="https://img.shields.io/github/stars/nick7nlp/Counting-Stars" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2403.12766"><strong>NovelQA: A Benchmark for Long-Range Novel Question Answering.</strong></a> <em>Cunxiang Wang, Ruoxi Ning, Boqi Pan, Tonghui Wu, Qipeng Guo, Cheng Deng, Guangsheng Bao, Qian Wang, Yue Zhang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/NovelQA/novelqa.github.io"><img src="https://img.shields.io/github/stars/NovelQA/novelqa.github.io" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2403.18802"><strong>Long-form factuality in large language models.</strong></a> <em>Jerry Wei, Chengrun Yang, Xinying Song, Yifeng Lu, Nathan Hu, Dustin Tran, Daiyi Peng, Ruibo Liu, Da Huang, Cosmo Du, Quoc V. Le.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/google-deepmind/long-form-factuality"><img src="https://img.shields.io/github/stars/google-deepmind/long-form-factuality" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2403.20279"><strong>LUQ: Long-text Uncertainty Quantification for LLMs.</strong></a> <em>JCaiqi Zhang, Fangyu Liu, Marco Basaldella, Nigel Collier.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2403.03514"><strong>CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models.</strong></a> <em>Zexuan Qiu, Jingjing Li, Shijue Huang, Wanjun Zhong, Irwin King.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/zexuanqiu/CLongEval"><img src="https://img.shields.io/github/stars/zexuanqiu/CLongEval" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.02060"><strong>Long-context LLMs Struggle with Long In-context Learning.</strong></a> <em>Tianle Li, Ge Zhang, Quy Duc Do, Xiang Yue, Wenhu Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/TIGER-AI-Lab/LongICLBench"><img src="https://img.shields.io/github/stars/TIGER-AI-Lab/LongICLBench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.02103"><strong>CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems.</strong></a> <em>Sara Rosenthal, Avirup Sil, Radu Florian, Salim Roukos.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/primeqa/clapnq"><img src="https://img.shields.io/github/stars/primeqa/clapnq" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.05446"><strong>XL2Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies.</strong></a> <em>Xuanfan Ni, Hengyi Cai, Xiaochi Wei, Shuaiqiang Wang, Dawei Yin, Piji Li.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/nuaa-nlp/XL2Bench"><img src="https://img.shields.io/github/stars/nuaa-nlp/XL2Bench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://openreview.net/forum?id=PdaPky8MUn"><strong>Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors.</strong></a> <em>Ido Amos, Jonathan Berant, Ankit Gupta.</em> ICLR 2024 Oral.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.06480"><strong>Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks.</strong></a> <em>Chonghua Wang, Haodong Duan, Songyang Zhang, Dahua Lin, Kai Chen.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/open-compass/Ada-LEval"><img src="https://img.shields.io/github/stars/open-compass/Ada-LEval" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.06654"><strong>RULER: What’s the Real Context Size of Your Long-Context Language Models?.</strong></a> <em>Cheng-Ping Hsieh, Simeng Sun, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia, Boris Ginsburg.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/hsiehjackson/RULER"><img src="https://img.shields.io/github/stars/hsiehjackson/RULER" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.12096"><strong>LongEmbed: Extending Embedding Models for Long Context Retrieval.</strong></a> <em>Dawei Zhu, Liang Wang, Nan Yang, Yifan Song, Wenhao Wu, Furu Wei, Sujian Li.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/dwzhu-pku/LongEmbed"><img src="https://img.shields.io/github/stars/dwzhu-pku/LongEmbed" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2404.16811"><strong>Make Your LLM Fully Utilize the Context.</strong></a> <em>Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/microsoft/FILM"><img src="https://img.shields.io/github/stars/microsoft/FILM" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2310.15147"><strong>S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models.</strong></a> <em>Fangyu Lei, Qian Liu, Yiming Huang, Shizhu He, Jun Zhao, Kang Liu.</em> NAACL 2024.</li>
</ol>

<p>        <a href="https://github.com/lfy79001/S3Eval"><img src="https://img.shields.io/github/stars/lfy79001/S3Eval" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.00200"><strong>In-Context Learning with Long-Context Models: An In-Depth Exploration.</strong></a> <em>Amanda Bertsch, Maor Ivgi, Uri Alon, Jonathan Berant, Matthew R. Gormley, Graham Neubig.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/abertsch72/long-context-icl"><img src="https://img.shields.io/github/stars/abertsch72/long-context-icl" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://www-cdn.anthropic.com/af5633c94ed2beb282f6a53c595eb437e8e7b630/Many_Shot_Jailbreaking__2024_04_02_0936.pdf"><strong>Many-shot Jailbreaking.</strong></a>  Anthropic 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.05938"><strong>DOLOMITES: Domain-Specific Long-Form Methodical Tasks.</strong></a> <em>Chaitanya Malaviya, Priyanka Agrawal, Kuzman Ganchev, Pranesh Srinivasan, Fantine Huot, Jonathan Berant, Mark Yatskar, Dipanjan Das, Mirella Lapata, Chris Alberti.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.08944"><strong>Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis.</strong></a> <em>Yao Fu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.09980"><strong>FinTextQA: A Dataset for Long-form Financial Question Answering.</strong></a> <em>Jian Chen, Peilin Zhou, Yining Hua, Yingxin Loh, Kehui Chen, Ziyuan Li, Bing Zhu, Junwei Liang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.11577"><strong>A Multi-Perspective Analysis of Memorization in Large Language Models.</strong></a> <em>Bowen Chen, Namgi Han, Yusuke Miyao.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.12701"><strong>OLAPH: Improving Factuality in Biomedical Long-form Question Answering.</strong></a> <em>Minbyul Jeong, Hyeon Hwang, Chanwoong Yoon, Taewhoo Lee, Jaewoo Kang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/dmis-lab/OLAPH"><img src="https://img.shields.io/github/stars/dmis-lab/OLAPH" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.14804"><strong>Can LLMs Solve longer Math Word Problems Better?.</strong></a> <em>Xin Xu, Tong Xiao, Zitong Chao, Zhenya Huang, Can Yang, Yang Wang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/XinXU-USTC/CoLeG-Math"><img src="https://img.shields.io/github/stars/XinXU-USTC/CoLeG-Math" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2405.14591"><strong>Base of RoPE Bounds Context Length.</strong></a> <em>Xin Men, Mingyu Xu, Bingning Wang, Qingyu Zhang, Hongyu Lin, Xianpei Han, Weipeng Chen.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2404.11018"><strong>Many-shot In-Context Learning.</strong></a> <em>Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Luis Rosias, Stephanie Chan, Biao Zhang, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2405.17915"><strong>Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models.</strong></a> <em>Longze Chen, Ziqiang Liu, Wanwei He, Yunshui Li, Run Luo, Min Yang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/October2001/ProLong"><img src="https://img.shields.io/github/stars/October2001/ProLong" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.20131"><strong>Language Models Need Inductive Biases to Count Inductively.</strong></a> <em>Yingshan Chang, Yonatan Bisk.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/zdxdsw/inductive_counting_with_LMs"><img src="https://img.shields.io/github/stars/zdxdsw/inductive_counting_with_LMs" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.02472"><strong>Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding.</strong></a> <em>Zhihan Zhang, Yixin Cao, Chenchen Ye, Yunshan Ma, Lizi Liao, Tat-Seng Chua.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.04744"><strong>CRAG – Comprehensive RAG Benchmark.</strong></a> <em>Xiao Yang, Kai Sun, Hao Xin, Yushi Sun, Nikita Bhalla, Xiangsen Chen, Sajal Choudhary, Rongze Daniel Gui, Ziran Will Jiang, Ziyu Jiang, Lingkun Kong, Brian Moran, Jiaqi Wang, Yifan Ethan Xu, An Yan, Chenyu Yang, Eting Yuan, Hanwen Zha, Nan Tang, Lei Chen, Nicolas Scheffer, Yue Liu, Nirav Shah, Rakesh Wanga, Anuj Kumar, Wen-tau Yih, Xin Luna Dong.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://www.aicrowd.com/challenges/meta-comprehensive-rag-benchmark-kdd-cup-2024"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.07887"><strong>An Empirical Study of Mamba-based Language Models.</strong></a> <em>Roger Waleffe, Wonmin Byeon, Duncan Riach, Brandon Norick, Vijay Korthikanti, Tri Dao, Albert Gu, Ali Hatamizadeh, Sudhakar Singh, Deepak Narayanan, Garvit Kulshreshtha, Vartika Singh, Jared Casper, Jan Kautz, Mohammad Shoeybi, Bryan Catanzaro.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/NVIDIA/Megatron-LM/tree/ssm/examples/mamba"><img src="https://img.shields.io/github/stars/NVIDIA/Megatron-LM" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.10149"><strong>BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack.</strong></a> <em>Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Ivan Rodkin, Dmitry Sorokin, Artyom Sorokin, Mikhail Burtsev.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/booydar/babilong"><img src="https://img.shields.io/github/stars/booydar/babilong" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.11629"><strong>Can Many-Shot In-Context Learning Help Long-Context LLM Judges? See More, Judge Better!.</strong></a> <em>Mingyang Song, Mao Zheng, Xuan Luo.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/nick7nlp/SeeMoreJudgeBetter"><img src="https://img.shields.io/github/stars/nick7nlp/SeeMoreJudgeBetter" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.11238"><strong>What Kinds of Tokens Benefit from Distant Text? An Analysis on Long Context Language Modeling.</strong></a> <em>Yutong Hu, Quzhe Huang, Kangcheng Luo, Yansong Feng.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.13282"><strong>Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective.</strong></a> <em>Meizhi Zhong, Chen Zhang, Yikun Lei, Xikai Liu, Yan Gao, Yao Hu, Kehai Chen, Min Zhang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.13121"><strong>Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?.</strong></a> <em>Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, Kelvin Guu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/google-deepmind/loft"><img src="https://img.shields.io/github/stars/google-deepmind/loft" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.14673"><strong>Insights into LLM Long-Context Failures: When Transformers Know but Don’t Tell.</strong></a> <em>Taiming Lu, Muhan Gao, Kuai Yu, Adam Byerly, Daniel Khashabi.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/TaiMingLu/know-dont-tell"><img src="https://img.shields.io/github/stars/TaiMingLu/know-dont-tell" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.15019"><strong>MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to 200K Tokens.</strong></a> <em>Yongqi Fan, Hongli Sun, Kui Xue, Xiaofan Zhang, Shaoting Zhang, Tong Ruan.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/JOHNNY-fans/MedOdyssey"><img src="https://img.shields.io/github/stars/JOHNNY-fans/MedOdyssey" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.16833"><strong>USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations.</strong></a> <em>Mounika Marreddy, Subba Reddy Oota, Venkata Charan Chinni, Manish Gupta, Lucie Flek.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.16008"><strong>Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization.</strong></a> <em>Cheng-Yu Hsieh, Yung-Sung Chuang, Chun-Liang Li, Zifeng Wang, Long T. Le, Abhishek Kumar, James Glass, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.16264"><strong>One Thousand and One Pairs: A “novel” challenge for long-context language models.</strong></a> <em>Marzena Karpinska, Katherine Thai, Kyle Lo, Tanya Goyal, Mohit Iyyer.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/marzenakrp/nocha/"><img src="https://img.shields.io/github/stars/marzenakrp/nocha" alt="GitHub Repo stars" /></a>
        <a href="https://novelchallenge.github.io/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2406.17588"><strong>LongIns: A Challenging Long-context Instruction-based Exam for LLMs.</strong></a> <em>Shawn Gavin, Tuney Zheng, Jiaheng Liu, Quehry Que, Noah Wang, Jian Yang, Chenchen Zhang, Wenhao Huang, Wenhu Chen, Ge Zhang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.17419"><strong>Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA.</strong></a> <em>Minzheng Wang, Longze Chen, Cheng Fu, Shengyi Liao, Xinghua Zhang, Bingli Wu, Haiyang Yu, Nan Xu, Lei Zhang, Run Luo, Yunshui Li, Min Yang, Fei Huang, Yongbin Li.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/MozerWang/Loong"><img src="https://img.shields.io/github/stars/MozerWang/Loong" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.19276"><strong>VERISCORE: Evaluating the factuality of verifiable claims in long-form text generation.</strong></a> <em>Yixiao Song, Yekyung Kim, Mohit Iyyer.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Yixiao-Song/VeriScore"><img src="https://img.shields.io/github/stars/Yixiao-Song/VeriScore" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.20015"><strong>ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models.</strong></a> <em>Yuxiang Zhang, Jing Chen, Junjie Wang, Yaxin Liu, Cheng Yang, Chufan Shi, Xinyu Zhu, Zihao Lin, Hanwen Wan, Yujiu Yang, Tetsuya Sakai, Tian Feng, Hayato Yamana.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/ToolBeHonest/ToolBeHonest"><img src="https://img.shields.io/github/stars/ToolBeHonest/ToolBeHonest" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.01527"><strong>KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches.</strong></a> <em>Jiayi Yuan, Hongyi Liu, Shaochen (Henry)Zhong, Yu-Neng Chuang, Songchen Li, Guanchu Wang, Duy Le, Hongye Jin, Vipin Chaudhary, Zhaozhuo Xu, Zirui Liu, Xia Hu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/henryzhongsc/longctx_bench"><img src="https://img.shields.io/github/stars/henryzhongsc/longctx_bench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.00402"><strong>Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP.</strong></a> <em>Omer Goldman, Alon Jacovi, Aviv Slobodkin, Aviya Maimon, Ido Dagan, Reut Tsarfaty.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.01370"><strong>Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems.</strong></a> <em>Philippe Laban, Alexander R. Fabbri, Caiming Xiong, Chien-Sheng Wu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/salesforce/summary-of-a-haystack"><img src="https://img.shields.io/github/stars/salesforce/summary-of-a-haystack" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.03916"><strong>Entity-Level Sentiment: More than the Sum of Its Parts.</strong></a> <em>Egil Rønningstad, Roman Klinger, Erik Velldal, Lilja Øvrelid.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.03651"><strong>Evaluating Language Model Context Windows: A “Working Memory” Test and Inference-time Correction.</strong></a> <em>Amanda Dsouza, Christopher Glaze, Changho Shin, Frederic Sala.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.07321"><strong>RAG vs. Long Context: Examining Frontier Large Language Models for Environmental Review Document Comprehension.</strong></a> <em>Hung Phan, Anurag Acharya, Sarthak Chaturvedi, Shivam Sharma, Mike Parker, Dan Nally, Ali Jannesari, Karl Pazdernik, Mahantesh Halappanavar, Sai Munikoti, Sameera Horawalavithana.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.07799"><strong>Attribute or Abstain: Large Language Models as Long Document Assistants.</strong></a> <em>Jan Buchmann, Xiao Liu, Iryna Gurevych.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/UKPLab/arxiv2024-attribute-or-abstain"><img src="https://img.shields.io/github/stars/UKPLab/arxiv2024-attribute-or-abstain" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.08112"><strong>How Well Can a Long Sequence Model Model Long Sequences? Comparing Architechtural Inductive Biases on Long-Context Abilities.</strong></a> <em>Jerry Huang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.10701"><strong>DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems.</strong></a> <em>Anni Zou, Wenhao Yu, Hongming Zhang, Kaixin Ma, Deng Cai, Zhuosheng Zhang, Hai Zhao, Dong Yu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Anni-Zou/DocBench"><img src="https://img.shields.io/github/stars/Anni-Zou/DocBench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.11963"><strong>NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?.</strong></a> <em>Mo Li, Songyang Zhang, Yunxin Liu, Kai Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/open-compass/opencompass"><img src="https://img.shields.io/github/stars/open-compass/opencompass" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.11016"><strong>LongLaMP: A Benchmark for Personalized Long-form Text Generation.</strong></a> <em>Ishita Kumar, Snigdha Viswanathan, Sushrita Yerra, Alireza Salemi, Ryan A. Rossi, Franck Dernoncourt, Hanieh Deilamsalehy, Xiang Chen, Ruiyi Zhang, Shubham Agarwal, Nedim Lipka, Hamed Zamani.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://longlamp-benchmark.github.io/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.13998"><strong>RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering.</strong></a> <em>Rujun Han, Yuhao Zhang, Peng Qi, Yumo Xu, Jenyuan Wang, Lan Liu, William Yang Wang, Bonan Min, Vittorio Castelli.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/awslabs/rag-qa-arena"><img src="https://img.shields.io/github/stars/awslabs/rag-qa-arena" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.15516"><strong>Attention Is All You Need But You Don’t Need All Of It For Inference of Large Language Models.</strong></a> <em>Georgy Tyukin, Gbetondji J-S Dovonon, Jean Kaddour, Pasquale Minervini.</em> ICML 2024 TF2M workshop.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.16695"><strong>Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack.</strong></a> <em>Xiaoyue Xu, Qinyuan Ye, Xiang Ren.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/INK-USC/Lifelong-ICL"><img src="https://img.shields.io/github/stars/INK-USC/Lifelong-ICL" alt="GitHub Repo stars" /></a>
        <a href="https://inklab.usc.edu/lifelong-icl/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.17468"><strong>WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries.</strong></a> <em>Wenting Zhao, Tanya Goyal, Yu Ying Chiu, Liwei Jiang, Benjamin Newman, Abhilasha Ravichander, Khyathi Chandu, Ronan Le Bras, Claire Cardie, Yuntian Deng, Yejin Choi.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.16833"><strong>Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach.</strong></a> <em>Zhuowan Li, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Michael Bendersky.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.21049"><strong>Evaluating Long Range Dependency Handling in Code Generation Models using Multi-Step Key Retrieval.</strong></a> <em>Yannick Assogba, Donghao Ren.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/apple/ml-key-retrieval-code-tasks"><img src="https://img.shields.io/github/stars/apple/ml-key-retrieval-code-tasks" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2408.02439"><strong>Long Input Benchmark for Russian Analysis.</strong></a> <em>Igor Churin, Murat Apishev, Maria Tikhonova, Denis Shevelev, Aydar Bulatov, Yuri Kuratov, Sergej Averkiev, Alena Fenogenova.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2408.03325"><strong>CoverBench: A Challenging Benchmark for Complex Claim Verification.</strong></a> <em>Alon Jacovi, Moran Ambar, Eyal Ben-David, Uri Shaham, Amir Feder, Mor Geva, Dror Marcus, Avi Caciularu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://huggingface.co/datasets/google/coverbench"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.07055"><strong>LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs.</strong></a> <em>Yushi Bai, Jiajie Zhang, Xin Lv, Linzhi Zheng, Siqi Zhu, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/THUDM/LongWriter"><img src="https://img.shields.io/github/stars/THUDM/LongWriter" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.10151"><strong>Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models.</strong></a> <em>Amey Hengle, Prasoon Bajpai, Soham Dan, Tanmoy Chakraborty.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/AmeyHengle/multilingual-needle-in-a-haystack"><img src="https://img.shields.io/github/stars/AmeyHengle/multilingual-needle-in-a-haystack" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.02076"><strong>LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs.</strong></a> <em>Yuhao Wu, Ming Shan Hee, Zhiqing Hu, Roy Ka-Wei Lee.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/mozhu621/LongGenBench/"><img src="https://img.shields.io/github/stars/mozhu621/LongGenBench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.01893"><strong>What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices.</strong></a> <em>Zhi Chen, Qiguang Chen, Libo Qin, Qipeng Guo, Haijun Lv, Yicheng Zou, Wanxiang Che, Hang Yan, Kai Chen, Dahua Lin.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/WowCZ/LongMIT"><img src="https://img.shields.io/github/stars/WowCZ/LongMIT" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.06338"><strong>Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks.</strong></a> <em>Zi Yang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.12181"><strong>A Controlled Study on Long Context Extension and Generalization in LLMs.</strong></a> <em>Yi Lu, Jing Nathan Yan, Songlin Yang, Justin T. Chiu, Siyu Ren, Fei Yuan, Wenting Zhao, Zhiyong Wu, Alexander M. Rush.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Leooyii/LCEG"><img src="https://img.shields.io/github/stars/Leooyii/LCEG" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.12558"><strong>RAD-Bench: Evaluating Large Language Models Capabilities in Retrieval Augmented Dialogues.</strong></a> <em>Tzu-Lin Kuo, Feng-Ting Liao, Mu-Wei Hsieh, Fu-Chieh Chang, Po-Chun Hsu, Da-Shan Shiu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/mtkresearch/RAD-Bench"><img src="https://img.shields.io/github/stars/mtkresearch/RAD-Bench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.12941"><strong>Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation.</strong></a> <em>Satyapriya Krishna, Kalpesh Krishna, Anhad Mohananey, Steven Schwarcz, Adam Stambler, Shyam Upadhyay, Manaal Faruqui.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://huggingface.co/datasets/google/frames-benchmark"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.12640"><strong>Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries.</strong></a> <em>Kiran Vodrahalli, Santiago Ontanon, Nilesh Tripuraneni, Kelvin Xu, Sanil Jain, Rakesh Shivanna, Jeffrey Hui, Nishanth Dikkala, Mehran Kazemi, Bahare Fatemi, Rohan Anil, Ethan Dyer, Siamak Shakeri, Roopali Vij, Harsh Mehta, Vinay Ramasesh, Quoc Le, Ed Chi, Yifeng Lu, Orhan Firat, Angeliki Lazaridou, Jean-Baptiste Lespiau, Nithya Attaluri, Kate Olszewska.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.02465"><strong>DetectiveQA: Evaluating Long-Context Reasoning on Detective Novels.</strong></a> <em>Zhe Xu, Jiasheng Ye, Xiangyang Liu, Tianxiang Sun, Xiaoran Liu, Qipeng Guo, Linlin Li, Qun Liu, Xuanjing Huang, Xipeng Qiu.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2409.02897"><strong>LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA.</strong></a> <em>Jiajie Zhang, Yushi Bai, Xin Lv, Wanjun Gu, Danqing Liu, Minhao Zou, Shulin Cao, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/THUDM/LongCite"><img src="https://img.shields.io/github/stars/THUDM/LongCite" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2409.16191"><strong>HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models.</strong></a> <em>Haoran Que, Feiyu Duan, Liqun He, Yutao Mou, Wangchunshu Zhou, Jiaheng Liu, Wenge Rong, Zekun Moore Wang, Jian Yang, Ge Zhang, Junran Peng, Zhaoxiang Zhang, Songyang Zhang, Kai Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Tintri/hello-bench"><img src="https://img.shields.io/github/stars/Tintri/hello-bench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2409.18006"><strong>Multilingual Evaluation of Long Context Retrieval and Reasoning.</strong></a> <em>Ameeta Agrawal, Andy Dang, Sina Bagheri Nezhad, Rhitabrat Pokharel, Russell Scheinberg.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.02115"><strong>L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?</strong></a> <em>Zecheng Tang, Keyan Zhou, Juntao Li, Baibei Ji, Jianye Hou, Min Zhang.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/ZetangForward/L-CITEEVAL"><img src="https://img.shields.io/github/stars/ZetangForward/L-CITEEVAL" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.02694"><strong>HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly.</strong></a> <em>Howard Yen, Tianyu Gao, Minmin Hou, Ke Ding, Daniel Fleischer, Peter Izasak, Moshe Wasserblat, Danqi Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/princeton-nlp/HELMET"><img src="https://img.shields.io/github/stars/princeton-nlp/HELMET" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.04698"><strong>MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs.</strong></a> <em>Lei Wang, Shan Dong, Yuhui Xu, Hanze Dong, Yalu Wang, Amrita Saha, Ee-Peng Lim, Caiming Xiong, Doyen Sahoo.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.04199"><strong>LongGenBench: Long-context Generation Benchmark.</strong></a> <em>Xiang Liu, Peijie Dong, Xuming Hu, Xiaowen Chu.</em> EMNLP 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.04422"><strong>Hyper-multi-step: The Truth Behind Difficult Long-context Tasks.</strong></a> <em>Yijiong Yu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/yuyijiong/hard_retrieval_for_llm"><img src="https://img.shields.io/github/stars/yuyijiong/hard_retrieval_for_llm" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.11996"><strong>Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data.</strong></a> <em>Seiji Maekawa, Hayate Iso, Nikita Bhutani.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/megagonlabs/holobench"><img src="https://img.shields.io/github/stars/megagonlabs/holobench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.12292"><strong>How much do contextualized representations encode long-range context?.</strong></a> <em>Simeng Sun, Cheng-Ping Hsieh.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.10813"><strong>LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory.</strong></a> <em>Di Wu, Hongwei Wang, Wenhao Yu, Yuwei Zhang, Kai-Wei Chang, Dong Yu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/xiaowu0162/LongMemEval"><img src="https://img.shields.io/github/stars/xiaowu0162/LongMemEval" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.10781"><strong>When Attention Sink Emerges in Language Models: An Empirical View.</strong></a> <em>Xiangming Gu, Tianyu Pang, Chao Du, Qian Liu, Fengzhuo Zhang, Cunxiao Du, Ye Wang, Min Lin.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/sail-sg/Attention-Sink"><img src="https://img.shields.io/github/stars/sail-sg/Attention-Sink" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.10210"><strong>Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key.</strong></a> <em>Yingda Chen, Xingjun Wang, Jintao Huang, Yunlin Mao, Daoze Zhang, Yuze Zhao.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.14641"><strong>Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs.</strong></a> <em>Runchu Tian, Yanghao Li, Yuepeng Fu, Siyang Deng, Qinyu Luo, Cheng Qian, Shuo Wang, Xin Cong, Zhong Zhang, Yesai Wu, Yankai Lin, Huadong Wang, Xiaojiang Liu.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/Rachum-thu/LongPiBench"><img src="https://img.shields.io/github/stars/Rachum-thu/LongPiBench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.16848"><strong>ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage.</strong></a> <em>Taewhoo Lee, Chanwoong Yoon, Kyochul Jang, Donghyeon Lee, Minju Song, Hyunjae Kim, Jaewoo Kang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/dmis-lab/ETHIC"><img src="https://img.shields.io/github/stars/dmis-lab/ETHIC" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.23000"><strong>Long2RAG: Evaluating Long-Context &amp; Long-Form Retrieval-Augmented Generation with Key Point Recall.</strong></a> <em>Zehan Qi, Rongwu Xu, Zhijiang Guo, Cunxiang Wang, Hao Zhang, Wei Xu.</em> EMNLP 2024.</li>
</ol>

<h3 id="112-mllm">11.2 MLLM</h3>

<ol>
  <li><a href="https://arxiv.org/abs/2404.18532"><strong>MileBench: Benchmarking MLLMs in Long Context.</strong></a> <em>Dingjie Song, Shunian Chen, Guiming Hardy Chen, Fei Yu, Xiang Wan, Benyou Wang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/MileBench/MileBench"><img src="https://img.shields.io/github/stars/MileBench/MileBench" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2405.09798"><strong>Many-Shot In-Context Learning in Multimodal Foundation Models.</strong></a> <em>Yixing Jiang, Jeremy Irvin, Ji Hun Wang, Muhammad Ahmed Chaudhry, Jonathan H. Chen, Andrew Y. Ng.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/stanfordmlgroup/ManyICL"><img src="https://img.shields.io/github/stars/stanfordmlgroup/ManyICL" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.04264"><strong>MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding.</strong></a> <em>Junjie Zhou, Yan Shu, Bo Zhao, Boya Wu, Shitao Xiao, Xi Yang, Yongping Xiong, Bo Zhang, Tiejun Huang, Zheng Liu.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/FlagOpen/FlagEmbedding/tree/master/MLVU"><img src="https://img.shields.io/github/stars/FlagOpen/FlagEmbedding" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.06025"><strong>RepoQA: Evaluating Long Context Code Understanding.</strong></a> <em>Jiawei Liu, Jia Le Tian, Vijay Daita, Yuxiang Wei, Yifeng Ding, Yuhan Katherine Wang, Jun Yang, Lingming Zhang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/evalplus/repoqa"><img src="https://img.shields.io/github/stars/evalplus/repoqa" alt="GitHub Repo stars" /></a>
        <a href="https://evalplus.github.io/repoqa.html"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.10221"><strong>Short Film Dataset (SFD): A Benchmark for Story-Level Video Understanding.</strong></a> <em>Ridouane Ghermi, Xi Wang, Vicky Kalogeiton, Ivan Laptev.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/shortfilmdataset/ShortFilmDataset"><img src="https://img.shields.io/github/stars/shortfilmdataset/ShortFilmDataset" alt="GitHub Repo stars" /></a>
        <a href="https://shortfilmdataset.github.io/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.11230"><strong>Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models.</strong></a> <em>Hengyi Wang, Haizhou Shi, Shiwei Tan, Weiyi Qin, Wenyuan Wang, Tunyu Zhang, Akshay Nambi, Tanuja Ganu, Hao Wang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/Wang-ML-Lab/multimodal-needle-in-a-haystack"><img src="https://img.shields.io/github/stars/Wang-ML-Lab/multimodal-needle-in-a-haystack" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2406.16851"><strong>Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts.</strong></a> <em>Aditya Sharma, Michael Saxon, William Yang Wang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://locovqa.github.io/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.01523"><strong>MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations.</strong></a> <em>Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/mayubo2333/MMLongBench-Doc"><img src="https://img.shields.io/github/stars/mayubo2333/MMLongBench-Doc" alt="GitHub Repo stars" /></a>
        <a href="https://mayubo2333.github.io/MMLongBench-Doc/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.03320"><strong>InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.</strong></a> <em>Pan Zhang, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Rui Qian, Lin Chen, Qipeng Guo, Haodong Duan, Bin Wang, Linke Ouyang, Songyang Zhang, Wenwei Zhang, Yining Li, Yang Gao, Peng Sun, Xinyue Zhang, Wei Li, Jingwen Li, Wenhai Wang, Hang Yan, Conghui He, Xingcheng Zhang, Kai Chen, Jifeng Dai, Yu Qiao, Dahua Lin, Jiaqi Wang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/InternLM/InternLM-XComposer"><img src="https://img.shields.io/github/stars/InternLM/InternLM-XComposer" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.03958"><strong>Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge.</strong></a> <em>Young-Jun Lee, Dokyong Lee, Junyoung Youn, Kyeongjin Oh, Byungsoo Ko, Jonghwan Hyeon, Ho-Jin Choi.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/passing2961/Stark"><img src="https://img.shields.io/github/stars/passing2961/Stark" alt="GitHub Repo stars" /></a>
        <a href="https://stark-dataset.github.io/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2407.09413"><strong>SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers.</strong></a> <em>Shraman Pramanick, Rama Chellappa, Subhashini Venugopalan.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2407.15754"><strong>LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding.</strong></a> <em>Haoning Wu, Dongxu Li, Bei Chen, Junnan Li.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/longvideobench/LongVideoBench"><img src="https://img.shields.io/github/stars/longvideobench/LongVideoBench" alt="GitHub Repo stars" /></a>
        <a href="https://longvideobench.github.io/"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.19669"><strong>mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval.</strong></a> <em>Xin Zhang, Yanzhao Zhang, Dingkun Long, Wen Xie, Ziqi Dai, Jialong Tang, Huan Lin, Baosong Yang, Pengjun Xie, Fei Huang, Meishan Zhang, Wenjie Li, Min Zhang.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://huggingface.co/Alibaba-NLP/gte-multilingual-base"><img src="https://img.shields.io/badge/Homepage-blue" alt="Static Badge" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.06281"><strong>MovieSum: An Abstractive Summarization Dataset for Movie Screenplays.</strong></a> <em>Rohit Saxena, Frank Keller.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/saxenarohit/MovieSum"><img src="https://img.shields.io/github/stars/saxenarohit/MovieSum" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2407.08683"><strong>SEED-Story: Multimodal Long Story Generation with Large Language Model.</strong></a> <em>Shuai Yang, Yuying Ge, Yang Li, Yukang Chen, Yixiao Ge, Ying Shan, Yingcong Chen.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/TencentARC/SEED-Story"><img src="https://img.shields.io/github/stars/TencentARC/SEED-Story" alt="GitHub Repo stars" /></a></p>

<h2 id="12-long-text-generation">12. Long Text Generation</h2>

<ol>
  <li>
    <p><a href="https://arxiv.org/abs/2410.06203"><strong>Integrating Planning into Single-Turn Long-Form Text Generation.</strong></a> <em>Yi Liang, You Wu, Honglei Zhuang, Li Chen, Jiaming Shen, Yiling Jia, Zhen Qin, Sumit Sanghai, Xuanhui Wang, Carl Yang, Michael Bendersky.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.10210"><strong>Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key.</strong></a> <em>Yingda Chen, Xingjun Wang, Jintao Huang, Yunlin Mao, Daoze Zhang, Yuze Zhao.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.04199"><strong>LongGenBench: Long-context Generation Benchmark.</strong></a> <em>Xiang Liu, Peijie Dong, Xuming Hu, Xiaowen Chu.</em> EMNLP 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.14309"><strong>LoGU: Long-form Generation with Uncertainty Expressions.</strong></a> <em>Ruihan Yang, Caiqi Zhang, Zhisong Zhang, Xinting Huang, Sen Yang, Nigel Collier, Dong Yu, Deqing Yang.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2410.17519"><strong>Large Language Models Still Exhibit Bias in Long Text.</strong></a> <em>Wonje Jeung, Dongjae Jeon, Ashkan Yousefpour, Jonghyun Choi.</em> Arxiv 2024.</p>
  </li>
  <li>
    <p><a href="https://arxiv.org/abs/2406.19371"><strong>Suri: Multi-constraint Instruction Following for Long-form Text Generation.</strong></a> <em>Chau Minh Pham, Simeng Sun, Mohit Iyyer.</em> Arxiv 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/chtmp223/suri"><img src="https://img.shields.io/github/stars/chtmp223/suri" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2408.07055"><strong>LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs.</strong></a> <em>Yushi Bai, Jiajie Zhang, Xin Lv, Linzhi Zheng, Siqi Zhu, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/THUDM/LongWriter"><img src="https://img.shields.io/github/stars/THUDM/LongWriter" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://arxiv.org/abs/2410.23933"><strong>Language Models can Self-Lengthen to Generate Long Texts.</strong></a> <em>Shanghaoran Quan, Tianyi Tang, Bowen Yu, An Yang, Dayiheng Liu, Bofei Gao, Jianhong Tu, Yichang Zhang, Jingren Zhou, Junyang Lin.</em> Arxiv 2024.</li>
</ol>

<p>        <a href="https://github.com/QwenLM/Self-Lengthen"><img src="https://img.shields.io/github/stars/QwenLM/Self-Lengthen" alt="GitHub Repo stars" /></a></p>

<h2 id="13-blogs">13. Blogs</h2>

<ol>
  <li>
    <p><a href="https://kaiokendev.github.io/context"><strong>Extending Context is Hard…but not Impossible†.</strong></a> <em>kaiokendev.</em> 2023.</p>
  </li>
  <li>
    <p><a href="https://www.reddit.com/r/LocalLLaMA/comments/14lz7j5/ntkaware_scaled_rope_allows_llama_models_to_have/"><strong>NTK-Aware Scaled RoPE.</strong></a> <em>u/bloc97
.</em> 2023.</p>
  </li>
  <li>
    <p><a href="https://blog.gopenai.com/how-to-speed-up-llms-and-use-100k-context-window-all-tricks-in-one-place-ffd40577b4c"><strong>The Secret Sauce behind 100K context window in LLMs: all tricks in one place.</strong></a> <em>Galina Alperovich.</em> 2023.</p>
  </li>
  <li>
    <p><a href="https://kexue.fm/archives/9431"><strong>Transformer Upgrade Path: 7. Length Extrapolation and Local Attention.</strong></a> <em>苏剑林(Jianlin Su).</em> 2023.</p>
  </li>
  <li>
    <p><a href="https://kexue.fm/archives/9603"><strong>Transformer Upgrade Path: 9. A New Approach to Global Length Extrapolation.</strong></a> <em>苏剑林(Jianlin Su).</em> 2023.</p>
  </li>
  <li>
    <p><a href="https://kexue.fm/archives/9708"><strong>Transformer Upgrade Path: 12. ReRoPE for Unbounded Extrapolation.</strong></a> <em>苏剑林(Jianlin Su).</em> 2023.</p>
  </li>
  <li>
    <p><a href="https://kexue.fm/archives/9731"><strong>Transformer Upgrade Path: 14. When HWFA Meets ReRoPE.</strong></a> <em>苏剑林(Jianlin Su).</em> 2023.</p>
  </li>
  <li>
    <p><a href="https://kexue.fm/archives/9859"><strong>Transformer Upgrade Path: 15. Key Normalization for Better Length Extrapolation.</strong></a> <em>苏剑林(Jianlin Su).</em> 2023.</p>
  </li>
  <li>
    <p><a href="https://kexue.fm/archives/9948"><strong>Transformer Upgrade Path: 16. Taking Stock of Length Extrapolation Techniques.</strong></a> <em>苏剑林(Jianlin Su).</em> 2024.</p>
  </li>
  <li>
    <p><a href="https://contextual.ai/introducing-rag2/"><strong>Introducing RAG 2.0.</strong></a> <em>Contextual AI Team.</em> 2024.</p>
  </li>
  <li>
    <p><a href="https://yaofu.notion.site/How-Do-Language-Models-put-Attention-Weights-over-Long-Context-10250219d5ce42e8b465087c383a034e"><strong>How Do Language Models put Attention Weights over Long Context?.</strong></a> <em>Yao Fu.</em> 2024.</p>
  </li>
  <li>
    <p><a href="https://openrag.notion.site/Open-RAG-c41b2a4dcdea4527a7c1cd998e763595"><strong>An open-source and open-access RAG platform.</strong></a> <em>Yunfan Gao.</em> 2024.</p>
  </li>
  <li>
    <p><a href="https://www.anthropic.com/research/many-shot-jailbreaking"><strong>Many-shot Jailbreaking.</strong></a> <em>Anthropic.</em> 2024.</p>
  </li>
  <li>
    <p><a href="https://yaofu.notion.site/Full-Stack-Transformer-Inference-Optimization-Season-2-Deploying-Long-Context-Models-ee25d3a77ba14f73b8ae19147f77d5e2"><strong>Full Stack Transformer Inference Optimization Season 2: Deploying Long-Context Models.</strong></a> <em>Yao Fu.</em> 2024.</p>
  </li>
  <li>
    <p><a href="https://spaces.ac.cn/archives/10091"><strong>缓存与效果的极限拉扯：从MHA、MQA、GQA到MLA.</strong></a> <em>苏剑林(Jianlin Su).</em> 2024.</p>
  </li>
  <li>
    <p><a href="https://yaofu.notion.site/Towards-100x-Speedup-Full-Stack-Transformer-Inference-Optimization-43124c3688e14cffaf2f1d6cbdf26c6c"><strong>Towards 100x Speedup: Full Stack Transformer Inference Optimization.</strong></a> <em>Yao Fu.</em> 2024.</p>
  </li>
  <li>
    <p><a href="https://zhuanlan.zhihu.com/p/699926343"><strong>2024.5 A Side-by-Side Comparison of the Long Context of Various LLMs (128k articles).</strong></a> <em>SomeoneKong.</em> 2024.</p>
  </li>
</ol>

<p>        <a href="https://github.com/SomeoneKong/llm_long_context_bench202405"><img src="https://img.shields.io/github/stars/SomeoneKong/llm_long_context_bench202405" alt="GitHub Repo stars" /></a></p>

<ol>
  <li><a href="https://zhuanlan.zhihu.com/p/700378183"><strong>2024.5 A Side-by-Side Comparison of the Long Context of Various LLMs (32k articles).</strong></a> <em>SomeoneKong.</em> 2024.</li>
</ol>

<p>        <a href="https://github.com/SomeoneKong/llm_long_context_bench202405"><img src="https://img.shields.io/github/stars/SomeoneKong/llm_long_context_bench202405" alt="GitHub Repo stars" /></a></p>

<ol>
  <li>
    <p><a href="https://kexue.fm/archives/10122"><strong>Transformer升级之路：18、RoPE的底数设计原则.</strong></a> <em>苏剑林(Jianlin Su).</em> 2024.</p>
  </li>
  <li>
    <p><a href="https://qwenlm.github.io/zh/blog/qwen-agent-2405/"><strong>Generalizing an LLM from 8k to 1M Context using Qwen-Agent.</strong></a> <em>Qwen Team.</em> 2024.</p>
  </li>
  <li>
    <p><a href="https://tridao.me/blog/2024/flash3/"><strong>FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision.</strong></a> <em>Jay Shah, Ganesh Bikshandi, Ying Zhang, Vijay Thakkar, Pradeep Ramani, Tri Dao.</em> 2024.</p>
  </li>
</ol>

<h2 id="acknowledgements">Acknowledgements</h2>
<p>Please contact me if I missed your name in the list; I will add you ASAP!</p>]]></content><author><name>Park Joon</name></author><category term="RESEARCH" /><category term="research" /><category term="llm" /><category term="ai" /><category term="long" /><category term="context" /><summary type="html"><![CDATA[Large Language Model Based Long Context Modeling Papers and Blogs]]></summary></entry><entry><title type="html">Llama-3.2-11b&amp;amp;90b-vision-instruct Model Tutorial Code</title><link href="https://joonlab.github.io/%EC%BD%94%EB%94%A9/llm/llama_3_2_11b&90b-vision-instruct-tutorial/" rel="alternate" type="text/html" title="Llama-3.2-11b&amp;amp;90b-vision-instruct Model Tutorial Code" /><published>2024-11-07T00:00:00+09:00</published><updated>2024-11-07T00:00:00+09:00</updated><id>https://joonlab.github.io/%EC%BD%94%EB%94%A9/llm/llama_3_2_11b&amp;90b-vision-instruct-tutorial</id><content type="html" xml:base="https://joonlab.github.io/%EC%BD%94%EB%94%A9/llm/llama_3_2_11b&amp;90b-vision-instruct-tutorial/"><![CDATA[<h2 id="ipynb-to-md-변환">.ipynb to .md Conversion</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">!</span>jupyter nbconvert <span class="nt">--to</span> markdown <span class="s2">"/content/Anthropic(Claude)_PDF_processing&amp;Token_counting_tutorial_by_PARK_JOON.ipynb"</span>
</code></pre></div></div>
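
<p>The same conversion can also be driven from Python instead of the shell. A minimal sketch using nbconvert’s exporter API (the output filename <code>converted.md</code> is an arbitrary choice for this example):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from nbconvert import MarkdownExporter

# Convert the same notebook as the shell command above to Markdown
exporter = MarkdownExporter()
body, resources = exporter.from_filename(
    "/content/Anthropic(Claude)_PDF_processing&amp;Token_counting_tutorial_by_PARK_JOON.ipynb"
)

with open("converted.md", "w", encoding="utf-8") as f:
    f.write(body)
</code></pre></div></div>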

<h2 id="made-by-park-joon"><a href="https://bio.link/joonpark">Made by PARK JOON</a></h2>

<hr />

<h2 id="참고-링크">참고 링크</h2>

<ul>
  <li><a href="https://build.nvidia.com/meta/llama-3.2-11b-vision-instruct">llama-3.2-11b-vision-instruct</a></li>
  <li><a href="https://build.nvidia.com/meta/llama-3.2-90b-vision-instruct">llama-3.2-90b-vision-instruct</a></li>
</ul>

<hr />

<h2 id="nvidia-api-key-발급-방법">NVIDIA API KEY 발급 방법</h2>

<h3 id="위의-참고-링크-중-하나에-접속한-뒤-다음-이미지-순서대로-참고하여-api-key-발급받기">위의 참고 링크 중 하나에 접속한 뒤, 다음 이미지 순서대로 참고하여 API KEY 발급받기</h3>

<p><img src="https://i.ibb.co/D1YmmP9/Clean-Shot-2024-11-07-at-19-41-35-2x.png" alt="이미지1" /></p>

<p><img src="https://i.ibb.co/C0gMhR8/2.png" alt="이미지2" /></p>

<p><img src="https://i.ibb.co/TKVQy1B/3.png" alt="이미지3" /></p>

<p><img src="https://i.ibb.co/8stJ40v/4.png" alt="이미지4" /></p>

<hr />

<h2 id="api-key-설정">API KEY 설정</h2>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">google.colab</span> <span class="kn">import</span> <span class="n">userdata</span>

<span class="n">NVIDIA_API_KEY</span> <span class="o">=</span> <span class="n">userdata</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">'NVIDIA_API_KEY'</span><span class="p">)</span>
</code></pre></div></div>
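
<p>If you are not running in Colab, <code>google.colab.userdata</code> will not be available. A minimal fallback sketch, assuming the key is exported as an environment variable also named <code>NVIDIA_API_KEY</code> (the variable name is just this tutorial’s convention):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os

try:
    # Inside Colab: read the key from the Colab Secrets store
    from google.colab import userdata
    NVIDIA_API_KEY = userdata.get('NVIDIA_API_KEY')
except ImportError:
    # Outside Colab: fall back to an environment variable
    NVIDIA_API_KEY = os.environ.get('NVIDIA_API_KEY')
</code></pre></div></div>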

<h3 id="참고-이미지">참고 이미지</h3>

<p><img src="https://i.imghippo.com/files/nkS8F1729430186.png" alt="코랩 환경 변수 설정 참고 이미지" /></p>

<p>⚠️ The reference image uses an OpenAI API key as its example.</p>

<p>In step 2️⃣, replace “OPENAI_API_KEY” with “NVIDIA_API_KEY”.</p>

<hr />

<h2 id="튜토리얼-ocr">튜토리얼 (OCR)</h2>

<h3 id="11b--파일-업로드">11B &amp; 파일 업로드</h3>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">base64</span>
<span class="kn">import</span> <span class="nn">httpx</span>
<span class="kn">from</span> <span class="nn">google.colab</span> <span class="kn">import</span> <span class="n">files</span>

<span class="k">def</span> <span class="nf">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_b64</span><span class="p">,</span> <span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    Extract text from a base64-encoded image.
    """</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">image_b64</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">180_000</span><span class="p">,</span> <span class="s">"이미지가 너무 큼. assets API 사용 필요함"</span>

    <span class="n">headers</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">"Authorization"</span><span class="p">:</span> <span class="sa">f</span><span class="s">"Bearer </span><span class="si">{</span><span class="n">NVIDIA_API_KEY</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
        <span class="s">"Accept"</span><span class="p">:</span> <span class="s">"application/json"</span>
    <span class="p">}</span>

    <span class="n">invoke_url</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"https://ai.api.nvidia.com/v1/gr/</span><span class="si">{</span><span class="n">model_name</span><span class="si">}</span><span class="s">/chat/completions"</span>

    <span class="n">prompt</span> <span class="o">=</span> <span class="s">"""업로드된 이미지 파일에서 텍스트만 추출(OCR)하고, !!!추출된 텍스트 외에는 아무말도 하지 말 것!!!
    즉, "업로드된 이미지 파일에서 추출한 텍스트는 다음과 같습니다."와 같은 쓸데없는 어구는 !!절대!! 포함되면 안 됨.
    """</span>

    <span class="n">payload</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">"model"</span><span class="p">:</span> <span class="n">model_name</span><span class="p">,</span>
        <span class="s">"messages"</span><span class="p">:</span> <span class="p">[</span>
            <span class="p">{</span>
                <span class="s">"role"</span><span class="p">:</span> <span class="s">"user"</span><span class="p">,</span>
                <span class="s">"content"</span><span class="p">:</span> <span class="sa">f</span><span class="s">'</span><span class="si">{</span><span class="n">prompt</span><span class="si">}</span><span class="s"> &lt;img src="data:image/png;base64,</span><span class="si">{</span><span class="n">image_b64</span><span class="si">}</span><span class="s">" /&gt;'</span>
            <span class="p">}</span>
        <span class="p">],</span>
        <span class="s">"max_tokens"</span><span class="p">:</span> <span class="mi">1024</span><span class="p">,</span>
        <span class="s">"temperature"</span><span class="p">:</span> <span class="mf">0.4</span><span class="p">,</span>
        <span class="s">"top_p"</span><span class="p">:</span> <span class="mf">0.95</span><span class="p">,</span>
        <span class="s">"stream"</span><span class="p">:</span> <span class="bp">False</span>
    <span class="p">}</span>

    <span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">post</span><span class="p">(</span><span class="n">invoke_url</span><span class="p">,</span> <span class="n">headers</span><span class="o">=</span><span class="n">headers</span><span class="p">,</span> <span class="n">json</span><span class="o">=</span><span class="n">payload</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">response</span><span class="p">.</span><span class="n">json</span><span class="p">()[</span><span class="s">'choices'</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s">'message'</span><span class="p">][</span><span class="s">'content'</span><span class="p">]</span>

<span class="k">def</span> <span class="nf">process_file_upload</span><span class="p">(</span><span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    사용자가 업로드한 파일 처리함
    """</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"파일을 업로드해주세요..."</span><span class="p">)</span>
    <span class="n">uploaded</span> <span class="o">=</span> <span class="n">files</span><span class="p">.</span><span class="n">upload</span><span class="p">()</span>

    <span class="k">for</span> <span class="n">filename</span> <span class="ow">in</span> <span class="n">uploaded</span><span class="p">.</span><span class="n">keys</span><span class="p">():</span>
        <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
            <span class="n">image_b64</span> <span class="o">=</span> <span class="n">base64</span><span class="p">.</span><span class="n">b64encode</span><span class="p">(</span><span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()).</span><span class="n">decode</span><span class="p">()</span>
        <span class="k">return</span> <span class="n">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_b64</span><span class="p">,</span> <span class="n">model_name</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">process_url_input</span><span class="p">(</span><span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    사용자가 입력한 URL 처리함
    """</span>
    <span class="n">url</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"이미지 URL을 입력해주세요: "</span><span class="p">)</span>
    <span class="n">image_data</span> <span class="o">=</span> <span class="n">base64</span><span class="p">.</span><span class="n">b64encode</span><span class="p">(</span><span class="n">httpx</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">).</span><span class="n">content</span><span class="p">).</span><span class="n">decode</span><span class="p">(</span><span class="s">"utf-8"</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_data</span><span class="p">,</span> <span class="n">model_name</span><span class="p">)</span>

<span class="n">MODEL_CONFIGS</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s">"1"</span><span class="p">:</span> <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"meta/llama-3.2-11b-vision-instruct"</span><span class="p">,</span>
        <span class="s">"description"</span><span class="p">:</span> <span class="s">"11B 모델 (더 빠름)"</span>
    <span class="p">},</span>
    <span class="s">"2"</span><span class="p">:</span> <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"meta/llama-3.2-90b-vision-instruct"</span><span class="p">,</span>
        <span class="s">"description"</span><span class="p">:</span> <span class="s">"90B 모델 (더 정확함)"</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="c1"># 모델 선택
</span><span class="k">print</span><span class="p">(</span><span class="s">"사용할 모델을 선택해주세요:"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">model</span> <span class="ow">in</span> <span class="n">MODEL_CONFIGS</span><span class="p">.</span><span class="n">items</span><span class="p">():</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">key</span><span class="si">}</span><span class="s">: </span><span class="si">{</span><span class="n">model</span><span class="p">[</span><span class="s">'description'</span><span class="p">]</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>

<span class="n">model_choice</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"선택 (1 또는 2): "</span><span class="p">)</span>
<span class="k">if</span> <span class="n">model_choice</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">MODEL_CONFIGS</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"잘못된 선택입니다. 1 또는 2를 입력해주세요."</span><span class="p">)</span>
    <span class="nb">exit</span><span class="p">()</span>

<span class="n">selected_model</span> <span class="o">=</span> <span class="n">MODEL_CONFIGS</span><span class="p">[</span><span class="n">model_choice</span><span class="p">][</span><span class="s">"name"</span><span class="p">]</span>

<span class="c1"># 입력 방식 선택
</span><span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">이미지 텍스트 추출 방식을 선택해주세요:"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"1: 파일 업로드"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"2: 이미지 URL 입력"</span><span class="p">)</span>

<span class="n">input_choice</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"선택 (1 또는 2): "</span><span class="p">)</span>

<span class="k">if</span> <span class="n">input_choice</span> <span class="o">==</span> <span class="s">"1"</span><span class="p">:</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">process_file_upload</span><span class="p">(</span><span class="n">selected_model</span><span class="p">)</span>
<span class="k">elif</span> <span class="n">input_choice</span> <span class="o">==</span> <span class="s">"2"</span><span class="p">:</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">process_url_input</span><span class="p">(</span><span class="n">selected_model</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"잘못된 선택입니다. 1 또는 2를 입력해주세요."</span><span class="p">)</span>
    <span class="nb">exit</span><span class="p">()</span>

<span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">추출된 텍스트:"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
</code></pre></div></div>

<p><strong>Output:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>사용할 모델을 선택해주세요:
1: 11B 모델 (더 빠름)
2: 90B 모델 (더 정확함)
선택 (1 또는 2): 1

이미지 텍스트 추출 방식을 선택해주세요:
1: 파일 업로드
2: 이미지 URL 입력
선택 (1 또는 2): 1
파일을 업로드해주세요...
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;IPython.core.display.HTML object&gt;
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Saving kakao.png to kakao (5).png

추출된 텍스트:
Your browser does not support the audio element.
</code></pre></div></div>
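<p>The <code class="language-plaintext highlighter-rouge">assert</code> in the cell above caps inline payloads at roughly 180,000 base64 characters; NVIDIA's documented route for larger files is the assets API. As a lightweight client-side alternative, here is a minimal sketch (assuming Pillow, which Colab ships by default) that downscales and re-encodes an image until it fits under that limit:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import base64
import io

from PIL import Image

def shrink_to_inline_limit(path, limit=180_000):
    """Downscale and JPEG-recompress an image until its base64 form fits the inline limit."""
    img = Image.open(path).convert("RGB")
    quality = 90
    while True:
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=quality)
        b64 = base64.b64encode(buf.getvalue()).decode()
        if len(b64) &lt; limit:
            return b64
        # First lower the JPEG quality, then start shrinking the dimensions.
        if quality &gt; 40:
            quality -= 10
        else:
            img = img.resize((max(1, img.width * 3 // 4), max(1, img.height * 3 // 4)))

# image_b64 = shrink_to_inline_limit("kakao.png")
</code></pre></div></div>

<p>If you feed the result into the cell above, remember that the bytes are now JPEG, so the data URI prefix should read <code class="language-plaintext highlighter-rouge">data:image/jpeg;base64,</code> instead of <code class="language-plaintext highlighter-rouge">data:image/png;base64,</code>.</p>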

<h3 id="11b--이미지-url">11B &amp; 이미지 URL</h3>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">base64</span>
<span class="kn">import</span> <span class="nn">httpx</span>
<span class="kn">from</span> <span class="nn">google.colab</span> <span class="kn">import</span> <span class="n">files</span>

<span class="k">def</span> <span class="nf">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_b64</span><span class="p">,</span> <span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    base64 인코딩된 이미지에서 텍스트 추출함
    """</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">image_b64</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">180_000</span><span class="p">,</span> <span class="s">"이미지가 너무 큼. assets API 사용 필요함"</span>

    <span class="n">headers</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">"Authorization"</span><span class="p">:</span> <span class="sa">f</span><span class="s">"Bearer </span><span class="si">{</span><span class="n">NVIDIA_API_KEY</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
        <span class="s">"Accept"</span><span class="p">:</span> <span class="s">"application/json"</span>
    <span class="p">}</span>

    <span class="n">invoke_url</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"https://ai.api.nvidia.com/v1/gr/</span><span class="si">{</span><span class="n">model_name</span><span class="si">}</span><span class="s">/chat/completions"</span>

    <span class="n">prompt</span> <span class="o">=</span> <span class="s">"""업로드된 이미지 파일에서 텍스트만 추출(OCR)하고, !!!추출된 텍스트 외에는 아무말도 하지 말 것!!!
    즉, "업로드된 이미지 파일에서 추출한 텍스트는 다음과 같습니다."와 같은 쓸데없는 어구는 !!절대!! 포함되면 안 됨.
    """</span>

    <span class="n">payload</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">"model"</span><span class="p">:</span> <span class="n">model_name</span><span class="p">,</span>
        <span class="s">"messages"</span><span class="p">:</span> <span class="p">[</span>
            <span class="p">{</span>
                <span class="s">"role"</span><span class="p">:</span> <span class="s">"user"</span><span class="p">,</span>
                <span class="s">"content"</span><span class="p">:</span> <span class="sa">f</span><span class="s">'</span><span class="si">{</span><span class="n">prompt</span><span class="si">}</span><span class="s"> &lt;img src="data:image/png;base64,</span><span class="si">{</span><span class="n">image_b64</span><span class="si">}</span><span class="s">" /&gt;'</span>
            <span class="p">}</span>
        <span class="p">],</span>
        <span class="s">"max_tokens"</span><span class="p">:</span> <span class="mi">1024</span><span class="p">,</span>
        <span class="s">"temperature"</span><span class="p">:</span> <span class="mf">0.4</span><span class="p">,</span>
        <span class="s">"top_p"</span><span class="p">:</span> <span class="mf">0.95</span><span class="p">,</span>
        <span class="s">"stream"</span><span class="p">:</span> <span class="bp">False</span>
    <span class="p">}</span>

    <span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">post</span><span class="p">(</span><span class="n">invoke_url</span><span class="p">,</span> <span class="n">headers</span><span class="o">=</span><span class="n">headers</span><span class="p">,</span> <span class="n">json</span><span class="o">=</span><span class="n">payload</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">response</span><span class="p">.</span><span class="n">json</span><span class="p">()[</span><span class="s">'choices'</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s">'message'</span><span class="p">][</span><span class="s">'content'</span><span class="p">]</span>

<span class="k">def</span> <span class="nf">process_file_upload</span><span class="p">(</span><span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    사용자가 업로드한 파일 처리함
    """</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"파일을 업로드해주세요..."</span><span class="p">)</span>
    <span class="n">uploaded</span> <span class="o">=</span> <span class="n">files</span><span class="p">.</span><span class="n">upload</span><span class="p">()</span>

    <span class="k">for</span> <span class="n">filename</span> <span class="ow">in</span> <span class="n">uploaded</span><span class="p">.</span><span class="n">keys</span><span class="p">():</span>
        <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
            <span class="n">image_b64</span> <span class="o">=</span> <span class="n">base64</span><span class="p">.</span><span class="n">b64encode</span><span class="p">(</span><span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()).</span><span class="n">decode</span><span class="p">()</span>
        <span class="k">return</span> <span class="n">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_b64</span><span class="p">,</span> <span class="n">model_name</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">process_url_input</span><span class="p">(</span><span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    사용자가 입력한 URL 처리함
    """</span>
    <span class="n">url</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"이미지 URL을 입력해주세요: "</span><span class="p">)</span>
    <span class="n">image_data</span> <span class="o">=</span> <span class="n">base64</span><span class="p">.</span><span class="n">b64encode</span><span class="p">(</span><span class="n">httpx</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">).</span><span class="n">content</span><span class="p">).</span><span class="n">decode</span><span class="p">(</span><span class="s">"utf-8"</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_data</span><span class="p">,</span> <span class="n">model_name</span><span class="p">)</span>

<span class="n">MODEL_CONFIGS</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s">"1"</span><span class="p">:</span> <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"meta/llama-3.2-11b-vision-instruct"</span><span class="p">,</span>
        <span class="s">"description"</span><span class="p">:</span> <span class="s">"11B 모델 (더 빠름)"</span>
    <span class="p">},</span>
    <span class="s">"2"</span><span class="p">:</span> <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"meta/llama-3.2-90b-vision-instruct"</span><span class="p">,</span>
        <span class="s">"description"</span><span class="p">:</span> <span class="s">"90B 모델 (더 정확함)"</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="c1"># 모델 선택
</span><span class="k">print</span><span class="p">(</span><span class="s">"사용할 모델을 선택해주세요:"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">model</span> <span class="ow">in</span> <span class="n">MODEL_CONFIGS</span><span class="p">.</span><span class="n">items</span><span class="p">():</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">key</span><span class="si">}</span><span class="s">: </span><span class="si">{</span><span class="n">model</span><span class="p">[</span><span class="s">'description'</span><span class="p">]</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>

<span class="n">model_choice</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"선택 (1 또는 2): "</span><span class="p">)</span>
<span class="k">if</span> <span class="n">model_choice</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">MODEL_CONFIGS</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"잘못된 선택입니다. 1 또는 2를 입력해주세요."</span><span class="p">)</span>
    <span class="nb">exit</span><span class="p">()</span>

<span class="n">selected_model</span> <span class="o">=</span> <span class="n">MODEL_CONFIGS</span><span class="p">[</span><span class="n">model_choice</span><span class="p">][</span><span class="s">"name"</span><span class="p">]</span>

<span class="c1"># 입력 방식 선택
</span><span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">이미지 텍스트 추출 방식을 선택해주세요:"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"1: 파일 업로드"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"2: 이미지 URL 입력"</span><span class="p">)</span>

<span class="n">input_choice</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"선택 (1 또는 2): "</span><span class="p">)</span>

<span class="k">if</span> <span class="n">input_choice</span> <span class="o">==</span> <span class="s">"1"</span><span class="p">:</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">process_file_upload</span><span class="p">(</span><span class="n">selected_model</span><span class="p">)</span>
<span class="k">elif</span> <span class="n">input_choice</span> <span class="o">==</span> <span class="s">"2"</span><span class="p">:</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">process_url_input</span><span class="p">(</span><span class="n">selected_model</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"잘못된 선택입니다. 1 또는 2를 입력해주세요."</span><span class="p">)</span>
    <span class="nb">exit</span><span class="p">()</span>

<span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">추출된 텍스트:"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
</code></pre></div></div>

<p><strong>Output:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>사용할 모델을 선택해주세요:
1: 11B 모델 (더 빠름)
2: 90B 모델 (더 정확함)
선택 (1 또는 2): 1

이미지 텍스트 추출 방식을 선택해주세요:
1: 파일 업로드
2: 이미지 URL 입력
선택 (1 또는 2): 2
이미지 URL을 입력해주세요: https://pbs.twimg.com/media/FSZuTSZWQAMzlBW.jpg:large

추출된 텍스트:
Attention Is All You Need

Adept
Ashish Vaswani*
Google Brain
avaswani@google.com
Noam Shazeer*
Google Brain
noam@google.com
Niki Parmar*
Google Research
nikip@google.com
Jakob Uszkoreit*
Google Research
uszkoreit@google.com

Llion Jones*
Google Research
llion@google.com
Aidan N. Gomez*
University of Toronto
aidan@cs.toronto.edu
Illia Polosukhin*
illia.polosukhin@gmail.com
Lukasz Kaiser*
Google Brain
lukaszaiser@google.com

co:here
NEAR INCORPORATED
Character.ai
Inceptive
</code></pre></div></div>

<h3 id="90b--파일-업로드">90B &amp; 파일 업로드</h3>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">base64</span>
<span class="kn">import</span> <span class="nn">httpx</span>
<span class="kn">from</span> <span class="nn">google.colab</span> <span class="kn">import</span> <span class="n">files</span>

<span class="k">def</span> <span class="nf">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_b64</span><span class="p">,</span> <span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    base64 인코딩된 이미지에서 텍스트 추출함
    """</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">image_b64</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">180_000</span><span class="p">,</span> <span class="s">"이미지가 너무 큼. assets API 사용 필요함"</span>

    <span class="n">headers</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">"Authorization"</span><span class="p">:</span> <span class="sa">f</span><span class="s">"Bearer </span><span class="si">{</span><span class="n">NVIDIA_API_KEY</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
        <span class="s">"Accept"</span><span class="p">:</span> <span class="s">"application/json"</span>
    <span class="p">}</span>

    <span class="n">invoke_url</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"https://ai.api.nvidia.com/v1/gr/</span><span class="si">{</span><span class="n">model_name</span><span class="si">}</span><span class="s">/chat/completions"</span>

    <span class="n">prompt</span> <span class="o">=</span> <span class="s">"""업로드된 이미지 파일에서 텍스트만 추출(OCR)하고, !!!추출된 텍스트 외에는 아무말도 하지 말 것!!!
    즉, "업로드된 이미지 파일에서 추출한 텍스트는 다음과 같습니다."와 같은 쓸데없는 어구는 !!절대!! 포함되면 안 됨.
    """</span>

    <span class="n">payload</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">"model"</span><span class="p">:</span> <span class="n">model_name</span><span class="p">,</span>
        <span class="s">"messages"</span><span class="p">:</span> <span class="p">[</span>
            <span class="p">{</span>
                <span class="s">"role"</span><span class="p">:</span> <span class="s">"user"</span><span class="p">,</span>
                <span class="s">"content"</span><span class="p">:</span> <span class="sa">f</span><span class="s">'</span><span class="si">{</span><span class="n">prompt</span><span class="si">}</span><span class="s"> &lt;img src="data:image/png;base64,</span><span class="si">{</span><span class="n">image_b64</span><span class="si">}</span><span class="s">" /&gt;'</span>
            <span class="p">}</span>
        <span class="p">],</span>
        <span class="s">"max_tokens"</span><span class="p">:</span> <span class="mi">1024</span><span class="p">,</span>
        <span class="s">"temperature"</span><span class="p">:</span> <span class="mf">0.4</span><span class="p">,</span>
        <span class="s">"top_p"</span><span class="p">:</span> <span class="mf">0.95</span><span class="p">,</span>
        <span class="s">"stream"</span><span class="p">:</span> <span class="bp">False</span>
    <span class="p">}</span>

    <span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">post</span><span class="p">(</span><span class="n">invoke_url</span><span class="p">,</span> <span class="n">headers</span><span class="o">=</span><span class="n">headers</span><span class="p">,</span> <span class="n">json</span><span class="o">=</span><span class="n">payload</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">response</span><span class="p">.</span><span class="n">json</span><span class="p">()[</span><span class="s">'choices'</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s">'message'</span><span class="p">][</span><span class="s">'content'</span><span class="p">]</span>

<span class="k">def</span> <span class="nf">process_file_upload</span><span class="p">(</span><span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    사용자가 업로드한 파일 처리함
    """</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"파일을 업로드해주세요..."</span><span class="p">)</span>
    <span class="n">uploaded</span> <span class="o">=</span> <span class="n">files</span><span class="p">.</span><span class="n">upload</span><span class="p">()</span>

    <span class="k">for</span> <span class="n">filename</span> <span class="ow">in</span> <span class="n">uploaded</span><span class="p">.</span><span class="n">keys</span><span class="p">():</span>
        <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
            <span class="n">image_b64</span> <span class="o">=</span> <span class="n">base64</span><span class="p">.</span><span class="n">b64encode</span><span class="p">(</span><span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()).</span><span class="n">decode</span><span class="p">()</span>
        <span class="k">return</span> <span class="n">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_b64</span><span class="p">,</span> <span class="n">model_name</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">process_url_input</span><span class="p">(</span><span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    사용자가 입력한 URL 처리함
    """</span>
    <span class="n">url</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"이미지 URL을 입력해주세요: "</span><span class="p">)</span>
    <span class="n">image_data</span> <span class="o">=</span> <span class="n">base64</span><span class="p">.</span><span class="n">b64encode</span><span class="p">(</span><span class="n">httpx</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">).</span><span class="n">content</span><span class="p">).</span><span class="n">decode</span><span class="p">(</span><span class="s">"utf-8"</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_data</span><span class="p">,</span> <span class="n">model_name</span><span class="p">)</span>

<span class="n">MODEL_CONFIGS</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s">"1"</span><span class="p">:</span> <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"meta/llama-3.2-11b-vision-instruct"</span><span class="p">,</span>
        <span class="s">"description"</span><span class="p">:</span> <span class="s">"11B 모델 (더 빠름)"</span>
    <span class="p">},</span>
    <span class="s">"2"</span><span class="p">:</span> <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"meta/llama-3.2-90b-vision-instruct"</span><span class="p">,</span>
        <span class="s">"description"</span><span class="p">:</span> <span class="s">"90B 모델 (더 정확함)"</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="c1"># 모델 선택
</span><span class="k">print</span><span class="p">(</span><span class="s">"사용할 모델을 선택해주세요:"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">model</span> <span class="ow">in</span> <span class="n">MODEL_CONFIGS</span><span class="p">.</span><span class="n">items</span><span class="p">():</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">key</span><span class="si">}</span><span class="s">: </span><span class="si">{</span><span class="n">model</span><span class="p">[</span><span class="s">'description'</span><span class="p">]</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>

<span class="n">model_choice</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"선택 (1 또는 2): "</span><span class="p">)</span>
<span class="k">if</span> <span class="n">model_choice</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">MODEL_CONFIGS</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"잘못된 선택입니다. 1 또는 2를 입력해주세요."</span><span class="p">)</span>
    <span class="nb">exit</span><span class="p">()</span>

<span class="n">selected_model</span> <span class="o">=</span> <span class="n">MODEL_CONFIGS</span><span class="p">[</span><span class="n">model_choice</span><span class="p">][</span><span class="s">"name"</span><span class="p">]</span>

<span class="c1"># 입력 방식 선택
</span><span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">이미지 텍스트 추출 방식을 선택해주세요:"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"1: 파일 업로드"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"2: 이미지 URL 입력"</span><span class="p">)</span>

<span class="n">input_choice</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"선택 (1 또는 2): "</span><span class="p">)</span>

<span class="k">if</span> <span class="n">input_choice</span> <span class="o">==</span> <span class="s">"1"</span><span class="p">:</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">process_file_upload</span><span class="p">(</span><span class="n">selected_model</span><span class="p">)</span>
<span class="k">elif</span> <span class="n">input_choice</span> <span class="o">==</span> <span class="s">"2"</span><span class="p">:</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">process_url_input</span><span class="p">(</span><span class="n">selected_model</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"잘못된 선택입니다. 1 또는 2를 입력해주세요."</span><span class="p">)</span>
    <span class="nb">exit</span><span class="p">()</span>

<span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">추출된 텍스트:"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
</code></pre></div></div>

<p><strong>Output:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>사용할 모델을 선택해주세요:
1: 11B 모델 (더 빠름)
2: 90B 모델 (더 정확함)
선택 (1 또는 2): 2

이미지 텍스트 추출 방식을 선택해주세요:
1: 파일 업로드
2: 이미지 URL 입력
선택 (1 또는 2): 1
파일을 업로드해주세요...
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;IPython.core.display.HTML object&gt;
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Saving kakao.png to kakao (4).png

추출된 텍스트:
&lt;progressBar class="progress-bar"&gt;&lt;/progress&gt; &lt;audio id="audioPlayer" src="https://raw.githubusercontent.com/joonlab/audio-share/main/research_paper_podcast_by_IU-Can_LLMs_Generate_Novel_Research_Ideas.mp3" type="audio/mpeg"&gt;&lt;/audio&gt;
</code></pre></div></div>

<h3 id="90b--이미지-url">90B &amp; 이미지 URL</h3>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">base64</span>
<span class="kn">import</span> <span class="nn">httpx</span>
<span class="kn">from</span> <span class="nn">google.colab</span> <span class="kn">import</span> <span class="n">files</span>

<span class="k">def</span> <span class="nf">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_b64</span><span class="p">,</span> <span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    base64 인코딩된 이미지에서 텍스트 추출함
    """</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">image_b64</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">180_000</span><span class="p">,</span> <span class="s">"이미지가 너무 큼. assets API 사용 필요함"</span>

    <span class="n">headers</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">"Authorization"</span><span class="p">:</span> <span class="sa">f</span><span class="s">"Bearer </span><span class="si">{</span><span class="n">NVIDIA_API_KEY</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
        <span class="s">"Accept"</span><span class="p">:</span> <span class="s">"application/json"</span>
    <span class="p">}</span>

    <span class="n">invoke_url</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"https://ai.api.nvidia.com/v1/gr/</span><span class="si">{</span><span class="n">model_name</span><span class="si">}</span><span class="s">/chat/completions"</span>

    <span class="n">prompt</span> <span class="o">=</span> <span class="s">"""업로드된 이미지 파일에서 텍스트만 추출(OCR)하고, !!!추출된 텍스트 외에는 아무말도 하지 말 것!!!
    즉, "업로드된 이미지 파일에서 추출한 텍스트는 다음과 같습니다."와 같은 쓸데없는 어구는 !!절대!! 포함되면 안 됨.
    """</span>

    <span class="n">payload</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">"model"</span><span class="p">:</span> <span class="n">model_name</span><span class="p">,</span>
        <span class="s">"messages"</span><span class="p">:</span> <span class="p">[</span>
            <span class="p">{</span>
                <span class="s">"role"</span><span class="p">:</span> <span class="s">"user"</span><span class="p">,</span>
                <span class="s">"content"</span><span class="p">:</span> <span class="sa">f</span><span class="s">'</span><span class="si">{</span><span class="n">prompt</span><span class="si">}</span><span class="s"> &lt;img src="data:image/png;base64,</span><span class="si">{</span><span class="n">image_b64</span><span class="si">}</span><span class="s">" /&gt;'</span>
            <span class="p">}</span>
        <span class="p">],</span>
        <span class="s">"max_tokens"</span><span class="p">:</span> <span class="mi">1024</span><span class="p">,</span>
        <span class="s">"temperature"</span><span class="p">:</span> <span class="mf">0.4</span><span class="p">,</span>
        <span class="s">"top_p"</span><span class="p">:</span> <span class="mf">0.95</span><span class="p">,</span>
        <span class="s">"stream"</span><span class="p">:</span> <span class="bp">False</span>
    <span class="p">}</span>

    <span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">post</span><span class="p">(</span><span class="n">invoke_url</span><span class="p">,</span> <span class="n">headers</span><span class="o">=</span><span class="n">headers</span><span class="p">,</span> <span class="n">json</span><span class="o">=</span><span class="n">payload</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">response</span><span class="p">.</span><span class="n">json</span><span class="p">()[</span><span class="s">'choices'</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s">'message'</span><span class="p">][</span><span class="s">'content'</span><span class="p">]</span>

<span class="k">def</span> <span class="nf">process_file_upload</span><span class="p">(</span><span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    사용자가 업로드한 파일 처리함
    """</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"파일을 업로드해주세요..."</span><span class="p">)</span>
    <span class="n">uploaded</span> <span class="o">=</span> <span class="n">files</span><span class="p">.</span><span class="n">upload</span><span class="p">()</span>

    <span class="k">for</span> <span class="n">filename</span> <span class="ow">in</span> <span class="n">uploaded</span><span class="p">.</span><span class="n">keys</span><span class="p">():</span>
        <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
            <span class="n">image_b64</span> <span class="o">=</span> <span class="n">base64</span><span class="p">.</span><span class="n">b64encode</span><span class="p">(</span><span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()).</span><span class="n">decode</span><span class="p">()</span>
        <span class="k">return</span> <span class="n">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_b64</span><span class="p">,</span> <span class="n">model_name</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">process_url_input</span><span class="p">(</span><span class="n">model_name</span><span class="p">):</span>
    <span class="s">"""
    사용자가 입력한 URL 처리함
    """</span>
    <span class="n">url</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"이미지 URL을 입력해주세요: "</span><span class="p">)</span>
    <span class="n">image_data</span> <span class="o">=</span> <span class="n">base64</span><span class="p">.</span><span class="n">b64encode</span><span class="p">(</span><span class="n">httpx</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">).</span><span class="n">content</span><span class="p">).</span><span class="n">decode</span><span class="p">(</span><span class="s">"utf-8"</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">get_image_text_from_base64</span><span class="p">(</span><span class="n">image_data</span><span class="p">,</span> <span class="n">model_name</span><span class="p">)</span>

<span class="n">MODEL_CONFIGS</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s">"1"</span><span class="p">:</span> <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"meta/llama-3.2-11b-vision-instruct"</span><span class="p">,</span>
        <span class="s">"description"</span><span class="p">:</span> <span class="s">"11B 모델 (더 빠름)"</span>
    <span class="p">},</span>
    <span class="s">"2"</span><span class="p">:</span> <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"meta/llama-3.2-90b-vision-instruct"</span><span class="p">,</span>
        <span class="s">"description"</span><span class="p">:</span> <span class="s">"90B 모델 (더 정확함)"</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="c1"># 모델 선택
</span><span class="k">print</span><span class="p">(</span><span class="s">"사용할 모델을 선택해주세요:"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">model</span> <span class="ow">in</span> <span class="n">MODEL_CONFIGS</span><span class="p">.</span><span class="n">items</span><span class="p">():</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">key</span><span class="si">}</span><span class="s">: </span><span class="si">{</span><span class="n">model</span><span class="p">[</span><span class="s">'description'</span><span class="p">]</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>

<span class="n">model_choice</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"선택 (1 또는 2): "</span><span class="p">)</span>
<span class="k">if</span> <span class="n">model_choice</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">MODEL_CONFIGS</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"잘못된 선택입니다. 1 또는 2를 입력해주세요."</span><span class="p">)</span>
    <span class="nb">exit</span><span class="p">()</span>

<span class="n">selected_model</span> <span class="o">=</span> <span class="n">MODEL_CONFIGS</span><span class="p">[</span><span class="n">model_choice</span><span class="p">][</span><span class="s">"name"</span><span class="p">]</span>

<span class="c1"># 입력 방식 선택
</span><span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">이미지 텍스트 추출 방식을 선택해주세요:"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"1: 파일 업로드"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"2: 이미지 URL 입력"</span><span class="p">)</span>

<span class="n">input_choice</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"선택 (1 또는 2): "</span><span class="p">)</span>

<span class="k">if</span> <span class="n">input_choice</span> <span class="o">==</span> <span class="s">"1"</span><span class="p">:</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">process_file_upload</span><span class="p">(</span><span class="n">selected_model</span><span class="p">)</span>
<span class="k">elif</span> <span class="n">input_choice</span> <span class="o">==</span> <span class="s">"2"</span><span class="p">:</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">process_url_input</span><span class="p">(</span><span class="n">selected_model</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"잘못된 선택입니다. 1 또는 2를 입력해주세요."</span><span class="p">)</span>
    <span class="nb">exit</span><span class="p">()</span>

<span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">추출된 텍스트:"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
</code></pre></div></div>

<p><strong>Output:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>사용할 모델을 선택해주세요:
1: 11B 모델 (더 빠름)
2: 90B 모델 (더 정확함)
선택 (1 또는 2): 2

이미지 텍스트 추출 방식을 선택해주세요:
1: 파일 업로드
2: 이미지 URL 입력
선택 (1 또는 2): 2
이미지 URL을 입력해주세요: https://pbs.twimg.com/media/FSZuTSZWQAMzlBW.jpg:large

추출된 텍스트:
Attention Is All You Need
Adept
Ashish Vaswani*
Google Brain
avaswani@google.com
Noam Shazeer*
Google Brain
noam@google.com
Niki Parmar*
Google Research
nikip@google.com
Jakob Uszkoreit*
Google Research
usz@google.com
Llion Jones*
Google Research
llion@google.com
Aidan N. Gomez*
University of Toronto
aidan@cs.toronto.edu
Łukasz Kaiser*
Google Brain
lukaszkaiser@google.com
Illia Polosukhin*
illia.polosukhin@gmail.com
NEAR INCORPORATED
</code></pre></div></div>]]></content><author><name>Park Joon</name></author><category term="코딩" /><category term="LLM" /><category term="python" /><category term="llm" /><category term="api" /><category term="nvidia" /><category term="llama" /><summary type="html"><![CDATA[Converting .ipynb to .md]]></summary></entry><entry><title type="html">Making a GitHub Blog - TeddyNote playlist</title><link href="https://joonlab.github.io/%EB%B8%94%EB%A1%9C%EA%B7%B8/teddynote-github-blog-playlist/" rel="alternate" type="text/html" title="Making a GitHub Blog - TeddyNote playlist" /><published>2024-11-07T00:00:00+09:00</published><updated>2024-11-07T00:00:00+09:00</updated><id>https://joonlab.github.io/%EB%B8%94%EB%A1%9C%EA%B7%B8/teddynote-github-blog-playlist</id><content type="html" xml:base="https://joonlab.github.io/%EB%B8%94%EB%A1%9C%EA%B7%B8/teddynote-github-blog-playlist/"><![CDATA[<h2 id="시즌-1">Season 1</h2>

<h3 id="ep01-개발환경-설치하기">EP01. 개발환경 설치하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/--MMmHbSH9k" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상에서는 시즌1 튜토리얼 영상을 진행하기 위한 기본 개발환경을 설치하는 방법을 다룹니다.

[참고 링크]
1. Quick Start Guide
https://mmistakes.github.io/minimal-mistakes/docs/quick-start-guide/

2. typora 다운로드
https://typora.io/

3. visual studio code 다운로드
https://code.visualstudio.com/

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep02-이미지-매우-간단하게-추가하기">EP02. 이미지 매우 간단하게 추가하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/1UEOWcKcVdk" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>이번 영상에서는 이미지를 매우 쉽게 추가하고 업로드 하는 방법에 대해서 다룹니다.

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep03-업데이트-내역을-실시간-확인하기-로컬-개발환경-설정방법">EP03. 업데이트 내역을 실시간 확인하기!! (로컬 개발환경 설정방법)</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/0TeHUqSAb6Q" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>이번 영상에서는 로컬 개발 환경을 설정하고 변경사항을 실시간으로 확인하는 방법에 대하여 알아봅시다!
더 이상 서버에 올리고 확인 NONO!

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep04-블로그-설정-매우-쉽게-변경하기---no코딩-configyml-활용">EP04. 블로그 설정 매우 쉽게 변경하기 - NO코딩! (config.yml 활용)</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/c-h3XcDjHtQ" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>지킬 블로그의 핵심인 config.yml 파일에 설정만 잘해줘도 훌륭한 블로그가 됩니다.

[참고 링크]
1. Quick Start Guide
https://mmistakes.github.io/minimal-mistakes/docs/quick-start-guide/

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep05-댓글--구글-애널리틱스-추가하기">EP05. 댓글 &amp; 구글 애널리틱스 추가하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/anXaW9xhgcU" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>이번 영상에서는 블로그에 댓글 기능과 
구글 애널리틱스 를 추가해 보도록 하겠습니다.

[참고 링크]
1. Disqus 댓글
https://disqus.com/

2. 구글 애널리틱스
https://analytics.google.com/analytics/web/

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep06-테마변경-sns-링크-삽입-pagination-설정">EP06. 테마변경, SNS 링크 삽입, Pagination 설정</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/Wi1W3hpfvZc" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>이번 영상에서는 테마 변경, SNS 링크 삽입, Pagination 설정등의 기능에 대하여 알아보겠습니다.

[참고 링크]
1. 테마변경
https://mmistakes.github.io/minimal-mistakes/docs/configuration/#air-skin-air

2. Locale 변경
https://mmistakes.github.io/minimal-mistakes/docs/configuration/#site-locale

3. Pagination
https://mmistakes.github.io/minimal-mistakes/docs/configuration/#outputting

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep07-카테고리-기능-태그-기능-추가하기">EP07. 카테고리 기능, #태그 기능 추가하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/3UOh0rKlxjg" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>이번 영상에서는 카테고리 기능과 필터링 
그리고 태그 기능을 추가해 보도록 하겠습니다.

[참고 링크]
1. 네비게이션
https://mmistakes.github.io/minimal-mistakes/docs/navigation/

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep08-글-목차-404-페이지-에러-구현">EP08. 글 목차, 404 페이지 에러 구현</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/OoeGqYu8JFQ" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>이번 영상에서는 글 목차 (Table of Contents)를 추가하고
404 에러 페이지를 커스터마이징 해 보도록 하겠습니다.

[참고 링크]
1. 글 목차 (Table of Contents)
https://mmistakes.github.io/minimal-mistakes/docs/helpers/#table-of-contents

2. 404 페이지 에러
https://mmistakes.github.io/minimal-mistakes/docs/pages/

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep09-구글-네이버-검색엔진-등록하기">EP09. 구글, 네이버 검색엔진 등록하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/OxRZrg0u6h4" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>이번 영상에서는 구글, 네이버 검색 엔진에 
블로그가 노출될 수 있도록 등록하는 방법에 대해 알아보겠습니다.

[참고 링크]
1. Google Search Console
https://search.google.com/search-console

2. 네이버 Search Advisor
https://searchadvisor.naver.com/

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep10-블로그-내-글-검색기능-추가하기">EP10. 블로그 내 글 검색기능 추가하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/AONVKTeeaWY" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>이번 영상에서는 블로그 내부에서 검색할 수 있는
"검색" 기능을 추가해 보도록 하겠습니다.

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep11-블로그에-설정된-폰트글씨체-변경하기">EP11. 블로그에 설정된 폰트(글씨체) 변경하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/k7DjQ1JF9rY" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상에서는 블로그에 설정된 폰트를 바꾸는 방법을 다룹니다.

[참고 링크]
1. 구글 폰트
https://fonts.google.com/?subset=korean

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep12-공지사항notice-버튼-youtube-영상-추가하기">EP12. 공지사항(Notice), 버튼, YouTube 영상 추가하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/q0P3TSoVNDM" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상에서는 공지사항, 버튼 추가 그리고 다양한 영상을 
깔끔하게 추가(Embedding) 하는 방법을 다룹니다.

[참고 링크]
1. 공지사항 추가하기
https://mmistakes.github.io/minimal-mistakes/docs/utility-classes/#notices

2. 버튼 추가하기
https://mmistakes.github.io/minimal-mistakes/docs/utility-classes/#buttons

3. 영상 추가하기 (YouTube, Vimeo, Google Drive 모두 가능)
https://mmistakes.github.io/minimal-mistakes/docs/helpers/#responsive-video-embed

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부 입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h2 id="시즌-2">시즌 2</h2>

<h3 id="ep14-깃헙github-블로그-만들기-시즌2를-시작합니다">EP14. 깃헙(Github) 블로그 만들기 시즌2를 시작합니다!</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/p1cdQPw-JME" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep15-최신-업데이트-내역-블로그에-적용하기최신-기능-적용">EP15. 최신 업데이트 내역 블로그에 적용하기(최신 기능 적용)</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/zoZ4LF-8j2E" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>
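
<p>최신 업데이트를 따라가는 방법 중 하나로, 테마 전체를 포크해서 직접 고치는 대신 <code class="language-plaintext highlighter-rouge">remote_theme</code>으로 원하는 릴리즈 버전을 지정하는 참고용 스케치를 덧붙입니다. 버전 태그는 예시용 가정값이니 실제 릴리즈 목록에서 확인이 필요합니다.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># _config.yml — 테마를 특정 릴리즈 버전으로 고정하는 예시 (버전은 가정값)
remote_theme: "mmistakes/minimal-mistakes@4.24.0"
plugins:
  - jekyll-include-cache   # remote_theme 방식 사용 시 테마가 요구하는 플러그인
</code></pre></div></div>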

<hr />

<h3 id="ep16-메인-페이지에-사진-추가-유튜브-아이콘링크-추가하기">EP16. 메인 페이지에 사진 추가, 유튜브 아이콘/링크 추가하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/PODVNQI6QL0" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>
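
<p>메인 페이지 사진과 유튜브 아이콘은 <code class="language-plaintext highlighter-rouge">_config.yml</code>의 author 섹션에서 설정합니다. 아래는 참고용 스케치로, 이미지 경로와 채널 URL은 예시용 가정값입니다.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># _config.yml — 프로필 사진과 유튜브 링크 예시 (경로·URL은 가정값)
author:
  name: "테디노트"
  avatar: "/assets/images/profile.png"        # 사이드바 프로필에 표시될 사진
  links:
    - label: "YouTube"
      icon: "fab fa-fw fa-youtube"            # Font Awesome 아이콘 클래스
      url: "https://www.youtube.com/@teddynote"   # 예시 채널 주소
</code></pre></div></div>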

<hr />

<h3 id="ep17-포스트-왼쪽-영역-확장-toc-스타일-수정하기css-스타일-수정하는-법">EP17. 포스트 왼쪽 영역 확장, TOC 스타일 수정하기(CSS 스타일 수정하는 법)</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/GIsCf9_jboM" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep18-이미지-추가시-오류-해결-as-영상">EP18. 이미지 추가시 오류 해결 (A/S 영상)</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/ndJ5B-DyBnA" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep19-사이트-주소가-바뀌었을-때-redirect_from-플러그인으로-해결하기">EP19. 사이트 주소가 바뀌었을 때. redirect_from 플러그인으로 해결하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/aVhu5CEpkSI" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>
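
<p>redirect_from 플러그인의 사용 형태를 참고용으로 요약하면 다음과 같습니다. 플러그인을 등록한 뒤, 주소가 바뀐 포스트의 front matter에 예전 주소를 나열하는 구조입니다(아래 경로는 예시용 가정값).</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 1) _config.yml에 플러그인 등록
plugins:
  - jekyll-redirect-from

# 2) 주소가 바뀐 포스트의 front matter에 예전 주소 나열 (경로는 가정값)
redirect_from:
  - /old-category/my-post/
  - /2020/01/01/my-post/
</code></pre></div></div>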

<hr />

<h3 id="ep20-나만의-커스텀-css-스타일을-마크다운-형식으로-적용하기">EP20. 나만의 커스텀 CSS 스타일을 마크다운 형식으로 적용하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/monQhJMsGi4" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그) : https://teddylee777.github.io
머신러닝 혼자서 스터디 : https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep21-latex-수식-문법을-지원하는-mathjax를-블로그에-적용하기">EP21. LaTeX 수식 문법을 지원하는 mathjax를 블로그에 적용하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/3O08iA_BFbM" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

📌 mathjax-support.html 파일 다운로드 링크
https://www.dropbox.com/s/xwi4hx49tauu95k/mathjax-support.html?dl=1

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#latex #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그): https://teddylee777.github.io
머신러닝 혼자서 스터디: https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="ep22-연도별-포스팅-아카이브-생성하기">EP22. 연도별 포스팅 아카이브 생성하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/251YUs2FGfI" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그): https://teddylee777.github.io
머신러닝 혼자서 스터디: https://github.com/teddylee777/machine-learning
</code></pre></div></div>
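
<p>연도별 아카이브는 minimal-mistakes가 제공하는 <code class="language-plaintext highlighter-rouge">posts</code> 레이아웃을 쓰는 페이지 하나로 만들 수 있습니다. 아래는 참고용 front matter 스케치이며, 파일 경로와 제목은 예시입니다.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># _pages/year-archive.md 의 front matter 예시 (경로·제목은 가정값)
title: "연도별 포스트"
layout: posts            # 포스트를 연도별로 묶어서 보여주는 테마 기본 레이아웃
permalink: /year-archive/
author_profile: true
</code></pre></div></div>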

<hr />

<h3 id="ep23-블로그-배포-더이상-기다리지-마세요-배포-브랜치-설정--배포-시점-확인하기">EP23. 블로그 배포 더이상 기다리지 마세요~ 배포 브랜치 설정 &amp; 배포 시점 확인하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/5aoIeNvquOE" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그): https://teddylee777.github.io
머신러닝 혼자서 스터디: https://github.com/teddylee777/machine-learning
</code></pre></div></div>
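
<p>영상에서 다루는 저장소 Settings의 브랜치 지정과 별개로, GitHub Actions 워크플로우로 배포하면 Actions 탭에서 빌드·배포 시점을 단계별로 확인할 수 있습니다. 아래는 GitHub 공식 Jekyll 스타터 워크플로우를 간추린 참고용 스케치로, 액션 버전 태그는 시점에 따라 달라질 수 있습니다.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># .github/workflows/pages.yml — 공식 스타터 워크플로우를 간추린 스케치
name: Deploy Jekyll site to Pages
on:
  push:
    branches: ["main"]        # 배포 대상 브랜치 (저장소 설정에 맞게 변경)
permissions:
  contents: read
  pages: write
  id-token: write
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/jekyll-build-pages@v1      # Jekyll 빌드
      - uses: actions/upload-pages-artifact@v3   # 빌드 결과물 업로드
  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
    steps:
      - uses: actions/deploy-pages@v4            # Pages로 배포
</code></pre></div></div>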

<hr />

<h3 id="ep24-사이드바에-카테고리--태그-숫자-카운트와-함께-추가하기">EP24. 사이드바에 카테고리 &amp; 태그 숫자 카운트와 함께 추가하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/FDFBJ_86sF4" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

📌 nav_list 파일 다운로드 링크
https://bit.ly/3Y2USN5

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그): https://teddylee777.github.io
머신러닝 혼자서 스터디: https://github.com/teddylee777/machine-learning
</code></pre></div></div>
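
<p>사이드바 메뉴 자체는 <code class="language-plaintext highlighter-rouge">_data/navigation.yml</code>에 정의하고, 보여줄 페이지의 front matter에 <code class="language-plaintext highlighter-rouge">sidebar: nav: "docs"</code>를 지정하는 구조입니다. 아래는 참고용 스케치이며 항목 이름과 URL은 예시입니다. 카테고리·태그별 숫자 카운트 표시는 영상에서 배포하는 nav_list 파일이 담당합니다.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># _data/navigation.yml — 사이드바 메뉴 예시 (이름·URL은 가정값)
docs:
  - title: "카테고리"
    children:
      - title: "머신러닝"
        url: /categories/machine-learning/
      - title: "블로그"
        url: /categories/blog/
</code></pre></div></div>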

<hr />

<h3 id="ep25-블로그에-상단-배너-추가하기">EP25. 블로그에 상단 배너 추가하기</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/fo3tpjxZbZQ" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌2의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그): https://teddylee777.github.io
머신러닝 혼자서 스터디: https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h2 id="번외편">번외편</h2>

<h3 id="ep13-번외편-깃헙-커밋-로그에-업데이트가-안된다면">EP13. (번외편) 깃헙 커밋 로그에 업데이트가 안된다면?</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/Z053Qn8LJyk" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상에서는 깃헙 블로그에 수정사항을 반영했지만
프로필의 커밋로그에 업데이트가 되지 않는 분들을 위한 영상입니다.

본 영상은 깃헙(Github) 블로그 만들기 - 시즌1의 일부입니다.

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tfcert-vod
테디노트(깃헙 블로그): https://teddylee777.github.io
머신러닝 혼자서 스터디: https://github.com/teddylee777/machine-learning
</code></pre></div></div>

<hr />

<h3 id="번외편-typora를-활용하여-블로그에-이미지-쉽게-추가하기-초간단-셋팅법">번외편. Typora를 활용하여 블로그에 이미지 쉽게 추가하기 (초간단 셋팅법)</h3>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/R3bMs8wr-jk" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>본 영상은 깃헙(Github) 블로그 만들기 - 시즌의 일부 입니다.
시리즈 강의로 제작되었습니다😁

블로그 “테디노트” 보러가기
📌 https://teddylee777.github.io/

✔️ 데모페이지: https://github.com/teddynote/teddynote.github.io
✔️ 도큐먼트: https://mmistakes.github.io/minimal-mistakes/docs/configuration/

#깃헙블로그 #github #githubpage
---
텐서플로우 자격증 취득 강의: https://bit.ly/tf-cert-inflearn
테디노트(깃헙 블로그): https://teddylee777.github.io
머신러닝 혼자서 스터디: https://github.com/teddylee777/machine-learning
</code></pre></div></div>]]></content><author><name>Park Joon</name></author><category term="블로그" /><category term="blog" /><category term="github" /><category term="블로그" /><category term="youtube" /><category term="유튜브" /><summary type="html"><![CDATA[시즌 1]]></summary></entry><entry><title type="html">코드 블록 테스트</title><link href="https://joonlab.github.io/test/code-tutorial/" rel="alternate" type="text/html" title="코드 블록 테스트" /><published>2024-11-05T00:00:00+09:00</published><updated>2024-11-05T00:00:00+09:00</updated><id>https://joonlab.github.io/test/code-tutorial</id><content type="html" xml:base="https://joonlab.github.io/test/code-tutorial/"><![CDATA[<h1 id="코드-블록-및-인라인-코드-테스트">코드 블록 및 인라인 코드 테스트</h1>

<p>이 문서에서는 다양한 코드 언어와 포맷을 Markdown 파일 내에서 테스트합니다. 주어진 설정에 따라 코드가 제대로 렌더링되는지 확인할 수 있습니다.</p>

<h2 id="1-인라인-코드">1. 인라인 코드</h2>

<p>텍스트 내에서 <code class="language-plaintext highlighter-rouge">print("Hello, World!")</code>와 같은 인라인 코드를 사용하려면 백틱(`)으로 감싸줍니다. 아래에 몇 가지 인라인 코드 예제를 더 추가해보겠습니다.</p>

<ul>
  <li>Python 예제: <code class="language-plaintext highlighter-rouge">for i in range(10): print(i)</code></li>
  <li>JavaScript 예제: <code class="language-plaintext highlighter-rouge">console.log("Hello, JavaScript!");</code></li>
  <li>YAML 예제: <code class="language-plaintext highlighter-rouge">version: '3.8'</code></li>
</ul>

<h2 id="2-python-코드-블록">2. Python 코드 블록</h2>

<p>Python 코드를 코드블록으로 작성하여 가독성을 높일 수 있습니다. 예:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">fibonacci</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
    <span class="s">"""Returns the Fibonacci sequence up to n terms."""</span>
    <span class="n">sequence</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">]</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="n">n</span><span class="p">):</span>
        <span class="n">sequence</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">sequence</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="n">sequence</span><span class="p">[</span><span class="o">-</span><span class="mi">2</span><span class="p">])</span>
    <span class="k">return</span> <span class="n">sequence</span>

<span class="k">print</span><span class="p">(</span><span class="n">fibonacci</span><span class="p">(</span><span class="mi">10</span><span class="p">))</span>
</code></pre></div></div>

<p>위의 코드에서는 피보나치 수열을 출력하는 간단한 Python 함수를 정의했습니다.</p>

<h2 id="3-json-코드-블록">3. JSON 코드 블록</h2>

<p>JSON 데이터를 코드블록으로 작성하여 구조가 잘 보이게 합니다. 예:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
    </span><span class="nl">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"John Doe"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"age"</span><span class="p">:</span><span class="w"> </span><span class="mi">30</span><span class="p">,</span><span class="w">
    </span><span class="nl">"is_student"</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span><span class="w">
    </span><span class="nl">"courses"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
        </span><span class="s2">"Math"</span><span class="p">,</span><span class="w">
        </span><span class="s2">"Science"</span><span class="p">,</span><span class="w">
        </span><span class="s2">"Literature"</span><span class="w">
    </span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>위 JSON 데이터는 사용자 정보를 저장하는 데 사용될 수 있습니다.</p>

<h2 id="4-javascript-코드-블록">4. JavaScript 코드 블록</h2>

<p>JavaScript 코드를 코드블록으로 작성하여 웹 개발 코드가 깔끔하게 보이도록 합니다. 예:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">greet</span><span class="p">(</span><span class="nx">name</span><span class="p">)</span> <span class="p">{</span>
    <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`Hello, </span><span class="p">${</span><span class="nx">name</span><span class="p">}</span><span class="s2">!`</span><span class="p">);</span>
<span class="p">}</span>

<span class="nx">greet</span><span class="p">(</span><span class="dl">"</span><span class="s2">JavaScript</span><span class="dl">"</span><span class="p">);</span>
</code></pre></div></div>

<p>위의 JavaScript 코드에서는 <code class="language-plaintext highlighter-rouge">greet</code>라는 함수를 통해 이름을 받아 출력하는 기능을 보여줍니다.</p>

<h2 id="5-yaml-코드-블록">5. YAML 코드 블록</h2>

<p>YAML 파일 구문도 코드블록으로 작성할 수 있습니다. YAML은 특히 구성 파일을 작성하는 데 유용합니다. 예:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">version</span><span class="pi">:</span> <span class="s1">'</span><span class="s">3.8'</span>
<span class="na">services</span><span class="pi">:</span>
  <span class="na">web</span><span class="pi">:</span>
    <span class="na">image</span><span class="pi">:</span> <span class="s">nginx:latest</span>
    <span class="na">ports</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s2">"</span><span class="s">80:80"</span>
    <span class="na">environment</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">NGINX_HOST=localhost</span>
      <span class="pi">-</span> <span class="s">NGINX_PORT=80</span>
</code></pre></div></div>

<p>위의 YAML 예제는 Docker Compose 파일을 작성할 때 자주 사용되는 설정 예시입니다.</p>

<h2 id="6-bash-스크립트-코드-블록">6. Bash 스크립트 코드 블록</h2>

<p>Bash 스크립트 코드는 코드블록으로 작성하여 쉘 명령어가 잘 보이도록 합니다. 예:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
<span class="nb">echo</span> <span class="s2">"Hello, Bash!"</span>

<span class="k">for </span>i <span class="k">in</span> <span class="o">{</span>1..5<span class="o">}</span>
<span class="k">do
   </span><span class="nb">echo</span> <span class="s2">"Loop </span><span class="nv">$i</span><span class="s2">"</span>
<span class="k">done</span>
</code></pre></div></div>

<p>위의 Bash 코드에서는 간단한 반복문과 출력 명령어를 보여줍니다.</p>

<h2 id="7-html-코드-블록">7. HTML 코드 블록</h2>

<p>HTML 코드도 코드블록으로 작성할 수 있습니다. 웹 요소를 설명하는 데 유용합니다. 예:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">&lt;!DOCTYPE html&gt;</span>
<span class="nt">&lt;html</span> <span class="na">lang=</span><span class="s">"en"</span><span class="nt">&gt;</span>
<span class="nt">&lt;head&gt;</span>
    <span class="nt">&lt;meta</span> <span class="na">charset=</span><span class="s">"UTF-8"</span><span class="nt">&gt;</span>
    <span class="nt">&lt;title&gt;</span>HTML Test<span class="nt">&lt;/title&gt;</span>
<span class="nt">&lt;/head&gt;</span>
<span class="nt">&lt;body&gt;</span>
    <span class="nt">&lt;h1&gt;</span>Hello, HTML!<span class="nt">&lt;/h1&gt;</span>
    <span class="nt">&lt;p&gt;</span>This is a paragraph in HTML.<span class="nt">&lt;/p&gt;</span>
<span class="nt">&lt;/body&gt;</span>
<span class="nt">&lt;/html&gt;</span>
</code></pre></div></div>

<p>위 HTML 코드는 간단한 웹 페이지를 생성하는 기본 요소를 포함하고 있습니다.</p>

<h2 id="8-css-코드-블록">8. CSS 코드 블록</h2>

<p>CSS 코드 블록으로 스타일을 적용하는 예제를 보여줍니다.</p>

<div class="language-css highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">body</span> <span class="p">{</span>
    <span class="nl">font-family</span><span class="p">:</span> <span class="n">Arial</span><span class="p">,</span> <span class="nb">sans-serif</span><span class="p">;</span>
    <span class="nl">background-color</span><span class="p">:</span> <span class="m">#f0f0f0</span><span class="p">;</span>
    <span class="nl">color</span><span class="p">:</span> <span class="m">#333</span><span class="p">;</span>
<span class="p">}</span>

<span class="nt">h1</span> <span class="p">{</span>
    <span class="nl">color</span><span class="p">:</span> <span class="m">#0073e6</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>위 CSS 코드는 HTML 요소의 스타일을 지정하는 간단한 예입니다.</p>

<p>이로써 다양한 언어의 코드 블록 테스트를 완료했습니다. 이 페이지에서 각 코드 블록이 잘 렌더링되는지 확인하시기 바랍니다.</p>]]></content><author><name>Park Joon</name></author><category term="test" /><category term="code" /><category term="test" /><category term="Python" /><category term="JSON" /><category term="JavaScript" /><category term="YAML" /><summary type="html"><![CDATA[코드 블록 및 인라인 코드 테스트]]></summary></entry><entry><title type="html">커스텀 CSS 스타일 적용 테스트</title><link href="https://joonlab.github.io/test/custom-css-tutorial/" rel="alternate" type="text/html" title="커스텀 CSS 스타일 적용 테스트" /><published>2024-11-05T00:00:00+09:00</published><updated>2024-11-05T00:00:00+09:00</updated><id>https://joonlab.github.io/test/custom-css-tutorial</id><content type="html" xml:base="https://joonlab.github.io/test/custom-css-tutorial/"><![CDATA[<h2 id="-text-right-사용-예시">{: .text-right} 사용 예시</h2>
<p class="text-right">커스텀 CSS 스타일을 적용하는 방법에 대해서 알아보겠습니다.</p>

<h2 id="-text-center-사용-예시">{: .text-center} 사용 예시</h2>
<p class="text-center">커스텀 CSS 스타일을 적용하는 방법에 대해서 알아보겠습니다.</p>

<h2 id="-text-left-사용-예시">{: .text-left} 사용 예시</h2>
<p class="text-left">커스텀 CSS 스타일을 적용하는 방법에 대해서 알아보겠습니다.</p>

<h2 id="-align-right-사용-예시">{: .align-right} 사용 예시</h2>
<p><img src="https://joonlab.github.io/images/2024-11-04-image-tutorial/image-20241104073010608.png" alt="image-20241104073010608" class="align-right" /></p>

<p><br clear="right" /></p>

<h2 id="-align-center-사용-예시">{: .align-center} 사용 예시</h2>

<p><img src="https://joonlab.github.io/images/2024-11-04-image-tutorial/image-20241104073010608.png" alt="image-20241104073010608" class="align-center" /></p>

<p><br clear="both" /></p>

<h2 id="-align-left-사용-예시">{: .align-left} 사용 예시</h2>
<p><img src="https://joonlab.github.io/images/2024-11-04-image-tutorial/image-20241104073010608.png" alt="image-20241104073010608" class="align-left" /></p>

<p><br clear="left" /></p>

<h2 id="-align-left와--img-width-half-혼합-사용-예시">{: .align-left}와 {: .img-width-half} 혼합 사용 예시</h2>
<p><img src="https://joonlab.github.io/images/2024-11-04-image-tutorial/image-20241104073010608.png" alt="image-20241104073010608" class="img-width-half align-left" /></p>]]></content><author><name>Park Joon</name></author><category term="test" /><summary type="html"><![CDATA[{: .text-right} 사용 예시 커스텀 CSS 스타일을 적용하는 방법에 대해서 알아보겠습니다.]]></summary></entry><entry><title type="html">라텍스 수식 테스트</title><link href="https://joonlab.github.io/test/latex-tutorial/" rel="alternate" type="text/html" title="라텍스 수식 테스트" /><published>2024-11-05T00:00:00+09:00</published><updated>2024-11-05T00:00:00+09:00</updated><id>https://joonlab.github.io/test/latex-tutorial</id><content type="html" xml:base="https://joonlab.github.io/test/latex-tutorial/"><![CDATA[<h1 id="latex-수식-테스트">LaTeX 수식 테스트</h1>

<p>이 문서에서는 다양한 라텍스 수식을 Markdown 파일 내에서 테스트합니다. 주어진 설정에 따라 수식이 제대로 렌더링되는지 확인할 수 있습니다.</p>

<h2 id="1-인라인-수식">1. 인라인 수식</h2>

<p>텍스트와 함께 인라인 수식을 사용하려면 <code class="language-plaintext highlighter-rouge">$...$</code> 문법을 사용합니다.</p>

<p>예: 이차 방정식의 해는 $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ 입니다.</p>

<h2 id="2-블록-수식">2. 블록 수식</h2>

<p>블록 수식을 사용하려면 <code class="language-plaintext highlighter-rouge">$$...$$</code> 문법을 사용하여 수식을 별도의 줄에 배치합니다.</p>

<p>예:</p>

\[x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\]

<h2 id="3-행렬-표현">3. 행렬 표현</h2>

<p>행렬은 <code class="language-plaintext highlighter-rouge">\begin{bmatrix} ... \end{bmatrix}</code>(대괄호) 또는 <code class="language-plaintext highlighter-rouge">\begin{pmatrix} ... \end{pmatrix}</code>(소괄호) 구문을 사용하여 표현할 수 있습니다.</p>

<p>예:</p>

\[\begin{bmatrix}
    a &amp; b \\
    c &amp; d
\end{bmatrix}\]

<p>또는 더 복잡한 행렬 표현:</p>

\[\begin{pmatrix}
    1 &amp; 2 &amp; 3 \\
    4 &amp; 5 &amp; 6 \\
    7 &amp; 8 &amp; 9
\end{pmatrix}\]

<h2 id="4-함수와-기호">4. 함수와 기호</h2>

<p>미적분 함수와 다양한 수학 기호도 사용할 수 있습니다.</p>

<p>예:</p>

<ul>
  <li>사인 함수: $\sin(\theta)$</li>
  <li>코사인 함수: $\cos(\theta)$</li>
  <li>적분: $\int_{a}^{b} x^2 \,dx$</li>
  <li>무한급수: $\sum_{n=1}^{\infty} \frac{1}{n^2}$</li>
</ul>

<p>블록으로 표현하면 다음과 같습니다.</p>

\[\int_{a}^{b} x^2 \,dx\]

\[\sum_{n=1}^{\infty} \frac{1}{n^2}\]

<h2 id="5-미분-및-적분">5. 미분 및 적분</h2>

<p>미분과 적분을 포함한 다양한 연산을 표현할 수 있습니다.</p>

<p>예:</p>

<ul>
  <li>미분: $\frac{d}{dx} f(x)$</li>
  <li>두 번째 미분: $\frac{d^2}{dx^2} f(x)$</li>
  <li>정적분: $\int_{a}^{b} f(x) \,dx$</li>
  <li>부분적분: $\int u \, dv = uv - \int v \, du$</li>
</ul>

<p>블록으로 표현하면 다음과 같습니다.</p>

\[\frac{d}{dx} f(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}\]

\[\int_{a}^{b} f(x) \,dx\]

<h2 id="6-케이스-분할">6. 케이스 분할</h2>

<p>수식을 여러 케이스로 나누어 표현할 수도 있습니다.</p>

<p>예:</p>

\[f(x) = 
\begin{cases} 
    x^2 &amp; \text{if } x \geq 0 \\
    -x &amp; \text{if } x &lt; 0 
\end{cases}\]

<h2 id="7-수열-및-급수">7. 수열 및 급수</h2>

<p>수열과 급수를 수식으로 나타낼 수 있습니다.</p>

<p>예:</p>

<ul>
  <li>등차수열: $a_n = a + (n-1)d$</li>
  <li>기하수열: $a_n = a \cdot r^{n-1}$</li>
</ul>

<p>급수 표현:</p>

\[\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}\]

<h2 id="8-복소수-및-오일러-공식">8. 복소수 및 오일러 공식</h2>

<p>복소수와 오일러 공식도 라텍스로 표현할 수 있습니다.</p>

<p>예:</p>

\[e^{i\theta} = \cos \theta + i \sin \theta\]

<p>그리고 오일러의 유명한 공식:</p>

\[e^{i\pi} + 1 = 0\]

<p>이로써 다양한 수식 테스트를 완료했습니다. 이 페이지에서 수식이 제대로 렌더링되는지 확인하시기 바랍니다.</p>]]></content><author><name>Park Joon</name></author><category term="test" /><category term="latex" /><category term="수식" /><category term="테스트" /><summary type="html"><![CDATA[LaTeX 수식 테스트]]></summary></entry><entry><title type="html">숨고 프로필</title><link href="https://joonlab.github.io/self/soomgo-profile-park-joon/" rel="alternate" type="text/html" title="숨고 프로필" /><published>2024-11-05T00:00:00+09:00</published><updated>2024-11-05T00:00:00+09:00</updated><id>https://joonlab.github.io/self/soomgo-profile-park-joon</id><content type="html" xml:base="https://joonlab.github.io/self/soomgo-profile-park-joon/"><![CDATA[<h1 id="박준">박준</h1>

<p>👨‍🎓수학과 / 인공지능 석사 (Kyoto Univ.) ・수학과외 카테고리 0.01% (리뷰순) ・데이터분석, 코딩 레슨 ・챗봇 등 외주 ✅</p>

<h2 id="리뷰-평점">리뷰 평점</h2>

<ul>
  <li><strong>평점</strong>: 5.0</li>
  <li><strong>리뷰수</strong>: 32</li>
  <li><strong>고용수</strong>: 38</li>
</ul>

<hr />

<h2 id="대표-서비스">대표 서비스</h2>

<ul>
  <li><strong>수학 과외</strong></li>
</ul>

<hr />

<h2 id="제공-서비스">제공 서비스</h2>

<ul>
  <li>프로그래밍/코딩 레슨</li>
  <li>수학 과외</li>
  <li>영어 과외</li>
  <li>일본어(일어) 과외</li>
  <li>JLPT 과외</li>
  <li>JPT 과외</li>
  <li>TOEIC/speaking/writing 과외</li>
  <li>대학입시 컨설팅</li>
  <li>대학입시자소서 컨설팅</li>
  <li>데이터분석 레슨</li>
  <li>챗봇 개발</li>
  <li>인공지능(AI) 개발</li>
  <li>대학원입시 컨설팅</li>
  <li>매크로/VBA 개발</li>
</ul>

<hr />

<h2 id="경력">경력</h2>

<ul>
  <li><strong>총 경력</strong>: 7년</li>
</ul>

<h3 id="llm-서비스-기획-및-개발">LLM 서비스 기획 및 개발</h3>

<ul>
  <li><strong>기간</strong>: 2024년 1월 - 현재 · 11개월</li>
  <li><strong>내용</strong>: LLM 서비스 기획 및 개발</li>
</ul>

<h3 id="강사">강사</h3>

<ul>
  <li><strong>기간</strong>: 2017년 3월 - 현재 · 7년 9개월</li>
  <li><strong>내용</strong>: 수학 / 데이터분석 / 코딩 / 머신러닝 / 딥러닝 / 영어</li>
</ul>

<hr />

<h2 id="학력">학력</h2>

<ul>
  <li>
    <p><strong>Kyoto University</strong></p>

    <ul>
      <li><strong>기간</strong>: 2022년 10월 - 현재</li>
      <li><strong>전공</strong>: Intelligence Science and Technology</li>
    </ul>
  </li>
</ul>

<hr />

<h2 id="고수-서비스-상세설명">고수 서비스 상세설명</h2>

<p>안녕하세요! 저는 작년 3월 일본 소재 대학 수학과를 졸업하고, 현재는 교토대학 대학원 정보학연구과 석사로 재학하며 인공지능과 데이터분석에 관하여 공부중입니다.<br />
학부 4학년 때 배정받은 연구실이 통계 연구실이었기 때문에 통계학 관련하여 공부를 많이 한 상태입니다.<br />
학부 시절 통계 연구실에서 다양한 과제를 수행하며 통계학에 대한 깊은 이해를 기르고, 체계적인 대학수학 인강 교재를 통해 중학교 수학부터 대학수학까지 지도할 수 있는 역량을 갖추게 되었습니다.<br />
고등학교 시절 치른 수학 모의고사와 수능 가운데 90% 이상에서 수학 100점을 받았으며, 주변 친구들에게 일상적으로 과외를 해줄 정도로 가르치는 것을 굉장히 좋아합니다. (생활기록부 증빙 가능)</p>

<p>수학 교육에 대한 열정을 가지고 있어, 유튜브와 인터넷 커뮤니티 등을 통해 고등수학과 다양한 수학 문제를 접하며 감각을 유지하고 있습니다. 저의 전공과 이러한 경험을 바탕으로 문제풀이의 정석과 꿀팁을 전수해 드리겠습니다.</p>

<p>제 지도력을 더욱 빛나게 하는 다양한 경험들과 과외방식을 소개합니다:</p>

<ol>
  <li>Columbia University, UCLA, Boston University 등 미국 유수 대학에 재학중인 대학(원)생들을 지도한 경험이 다수 있습니다. (통계학, 미적분학, 선형대수학, Merchandising Mathematics 등)</li>
  <li>일본유학시험 (EJU) 수학 인강 런칭</li>
  <li>일본 명문 사립대 와세다대학 학생의 대학미적분학(해석학) 시험 대비 수업을 진행한 경험이 있습니다.</li>
  <li>일본 대학 입시를 준비 중인 학생의 수학 기초부터 문제 풀이까지 장기간 지도한 경험이 있습니다.</li>
  <li>인공지능 대학원 석사 과정 면접 대비 강의를 진행한 경험이 있습니다.</li>
  <li>파이썬을 이용한 데이터 분석, 딥러닝, 머신러닝을 위한 수학에 대해서 가르친 경험이 있습니다.</li>
  <li>서울대, 연세대, 고려대, 중앙대, 한국외대, 서울시립대, 건국대 등 서울 유수의 대학생들의 과외 및 과제 지도 경험이 다수 있습니다.</li>
  <li>네이버 AI 모델러 직무 코딩테스트 및 면접(NLP, LLM 관련) 대비 수업을 진행한 경험이 있습니다.</li>
  <li>현대오토에버 AI 서비스 개발_대화/언어 서비스 직무 면접 대비 수업을 진행한 경험이 있습니다.</li>
  <li>중학교 수학 1-3학년 과정 전체를 지도한 적이 있습니다 (초등학생 선행).</li>
  <li>고등학교 3학년 재학 당시 규토 수학 고득점 N제 2017 검토</li>
  <li>현재 대성 수학 전과목 모의고사 검토 진행중</li>
  <li>정상모 수학 강사 교재, 모의고사 검토 경험 다수</li>
  <li>일본 이공계 대학 유학 중인 학생의 과제를 도와드린 적이 있습니다.</li>
  <li>대학미적분학, 선형대수학, 수리통계학, 미분방정식, 해석학, 경영통계 등 여러 분야에서의 문제풀이 진행 경험 多 &amp; 현재도 활발히 진행중</li>
  <li>카카오톡 오픈채팅방 수학 질문답변방(2023년 10월 기준 약 180명의 인원. 대부분은 고등학생이지만 초등학생, 중학생, 대학생도 존재)의 부방장으로서 활동하며 초중고 수학부터 대학수학까지 다양한 범위의 수학에 대한 답변을 하며 활발하게 활동중</li>
</ol>

<p>&lt;과외 방식&gt;</p>

<ol>
  <li>모든 과외는 Zoom을 통해 진행하며, tldv 녹화 도구를 이용해 수업 내용을 쉽게 복습할 수 있습니다 (원하지 않을 경우 녹화하지 않음). AI 요약과 자막 검색이 가능한 링크를 제공해 드리며, 수업이 끝난 직후 바로 동영상에 접근할 수 있습니다 (녹화 전에 미리 양해를 구합니다).</li>
  <li>수업 시간에 작성한 모든 필기는 PDF 등의 자료로 제공해 드립니다.</li>
  <li>태블릿이 있는 경우 Zoom 화이트보드를 통한 쌍방 필기, 또는 Conceptboard라는 사이트를 이용한 쌍방 필기가 가능합니다. (PDF 파일을 업로드하여 필기 가능)</li>
</ol>

<hr />

<p>어떠한 형태로든 최대한 맞춰드리니 편하게 연락주세요! 제 지도력을 믿고 맡겨주시면, 최선을 다해 꼼꼼하게 지도해 드리겠습니다.</p>

<p>함께 성장하는 즐거운 과외 시간을 만들어봅시다!</p>

<hr />

<h2 id="qa">Q&amp;A</h2>

<h3 id="q-서비스가-시작되기-전-어떤-절차로-진행하나요">Q. 서비스가 시작되기 전 어떤 절차로 진행하나요?</h3>

<p>숨고 채팅, 카톡 대화, 전화 상담, 줌 상담 등을 통해 고객님의 요구사항을 충족시켜드릴 수 있는지 충분히 사전 점검을 진행합니다.</p>

<h3 id="q-어떤-서비스를-전문적으로-제공하나요">Q. 어떤 서비스를 전문적으로 제공하나요?</h3>

<ul>
  <li>
    <p><strong>수학 과외</strong></p>

    <p>중고등수학 과외, 대학수학 과외, 편입수학 과외 및 공부 방향성 등에 대한 조언을 해드립니다.</p>
  </li>
  <li>
    <p><strong>외주</strong></p>

    <ul>
      <li>LLM을 이용한 각종 서비스(챗봇, 이미지 생성 툴, PDF 요약 툴 등) 개발 및 배포</li>
      <li>노코드 툴을 이용한 웹페이지 제작 및 배포</li>
      <li>각종 업무 자동화(make, Google Apps Script, Excel VBA 등) 구축</li>
    </ul>
  </li>
  <li>
    <p><strong>데이터분석, 코딩 과외</strong></p>

    <p>취미, 대학원 공부, 기업 면접 등 어떤 목적이든 목적에 맞춰서 수업을 진행합니다.</p>
  </li>
  <li>
    <p><strong>영어 과외</strong></p>

    <p>고등 영어, 토익 등 대부분의 영역에서의 Reading &amp; Listening 수업을 진행합니다.</p>
  </li>
</ul>

<h3 id="q-서비스의-견적은-어떤-방식으로-산정-되나요">Q. 서비스의 견적은 어떤 방식으로 산정 되나요?</h3>

<p>과외학생의 수준, 원하는 수업 방식의 준비 난이도 등의 기준에 따라 책정됩니다.</p>

<h3 id="q-완료한-서비스-중-대표적인-서비스는-무엇인가요-소요-시간은-얼마나-소요-되었나요">Q. 완료한 서비스 중 대표적인 서비스는 무엇인가요? 소요 시간은 얼마나 소요 되었나요?</h3>

<p>최근 대학수학 과외의 경우, 한 달 반 동안의 수업을 통해 과외 학생이 첫 미적분학 시험에서 원하는 성적을 거두도록 지도했습니다.</p>

<h3 id="q-as-또는-환불-규정은-어떻게-되나요">Q. A/S 또는 환불 규정은 어떻게 되나요?</h3>

<p>과외비는 주급 선불(그 주 시작 수업일 전날까지 납부)이 원칙입니다.<br />
과외 당일의 경우(수업 시작 전 6시간 미만이 남은 경우) 환불은 불가합니다.<br />
대신 사전 취소 및 변경의 경우 수업 시작 6시간 전까지 전액 환불을 보장해드립니다.</p>]]></content><author><name>Park Joon</name></author><category term="self" /><category term="soomgo" /><category term="숨고" /><category term="프로필" /><summary type="html"><![CDATA[박준]]></summary></entry></feed>