I built a real-time AR plane spotter, here's the math that makes it work

hackernews
#Hardware/Semiconductors

Summary

An Android app identifies aircraft overhead when you point your phone at the sky, fetching live ADS-B data and overlaying aircraft labels on the camera feed. Getting from a GPS coordinate in the sky to the right pixel on screen turns out to involve four distinct coordinate spaces, with transitions whose sign conventions fail silently.

Why it matters

Developer perspective

Shows the implementation difficulty of a coordinate-transform algorithm that combines open ADS-B data with the camera feed to map GPS coordinates accurately to screen pixels.

Researcher perspective

Presents a worked example of a real-time geometry problem: projecting positions from physical 3D space onto a 2D image plane and overlaying them live.

Business perspective

Demonstrates that commodity smartphone sensors plus an external real-time data API can power location-based-service (LBS) visualisation beyond simple entertainment.

Related entities

Android ADS-B

Body

I've been building an Android app that identifies aircraft overhead when you point your phone at the sky. The app fetches live ADS-B data and overlays aircraft labels on the camera feed, but getting the math right took much longer than I expected, so I wrote it all up.

The problem sounds simple: you have a GPS coordinate in the sky and a GPS coordinate in your hand, and you want a pixel. But there are four distinct coordinate spaces between those two things, and the transitions between them have sign conventions that fail silently: wrong output with no error.

The pipeline:
```
Geodetic (lat, lon, alt)
   ↓ flat-earth approx — valid <100 km, error <2 px at 50 nm range
ENU — East, North, Up (metres)
   ↓ R⊤ from Android TYPE_ROTATION_VECTOR sensor
Device frame (dX, dY, dZ)
   ↓ one sign flip: Cz = −dZ
Camera frame (Cx, Cy, Cz)
   ↓ perspective divide + FOV normalisation
Screen pixels (Xpx, Ypx)
```
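A minimal sketch of the first transition (geodetic → ENU) plus the derived bearing and elevation, assuming a spherical Earth with R_E = 6,371 km; the function name is mine, not the app's:

```python
import math

R_E = 6_371_000.0  # mean Earth radius in metres (assumption)

def geodetic_to_enu(user_lat, user_lon, user_alt, ac_lat, ac_lon, ac_alt):
    """Flat-earth approximation; fine for ranges well under ~100 km."""
    m_per_deg = math.pi * R_E / 180.0
    # East shrinks with latitude: meridians converge toward the poles.
    e = (ac_lon - user_lon) * m_per_deg * math.cos(math.radians(user_lat))
    n = (ac_lat - user_lat) * m_per_deg
    u = ac_alt - user_alt
    return e, n, u

# The captured fix from the post: ATR72 at 18,000 ft.
e, n, u = geodetic_to_enu(24.8600, 80.9813, 0.0,
                          24.9321, 81.0353, 18_000 * 0.3048)
bearing = math.degrees(math.atan2(e, n))                       # ~34.2 deg (NNE)
elevation = math.degrees(math.asin(u / math.hypot(e, n, u)))   # ~29.5 deg
```

Dropping the `cos` factor inflates E by about 10% at this latitude, which is exactly the kind of error that looks fine in a quick equator-adjacent test.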
Why each transition is non-obvious:

Geodetic → ENU. The East component has a cosine factor that most implementations miss: E = Δλ × (π·R_E/180) × cos(φ_user). Meridians converge toward the poles, so one degree of longitude is fewer metres at latitude 25° than at the equator. Without the factor, East-West positions look correct near the equator and quietly diverge as latitude increases.

ENU → Device frame. Android's rotation matrix R maps device axes to ENU world axes. To go the other direction you use R⊤. In Android's row-major FloatArray(9), this means column indices, not row indices:
```
R  (forward): dX = R[0]·E + R[1]·N + R[2]·U
R⊤ (inverse): dX = R[0]·E + R[3]·N + R[6]·U
```
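To make the indexing concrete, here is a sketch with a hypothetical attitude (device yawed 30° about Up), stored row-major the way Android's getRotationMatrix delivers it; Python stands in for Kotlin:

```python
import math

# Hypothetical rotation: 30 degrees about the Up axis, row-major 3x3.
a = math.radians(30.0)
R = [math.cos(a), -math.sin(a), 0.0,
     math.sin(a),  math.cos(a), 0.0,
     0.0,          0.0,         1.0]

E, N, U = 0.0, 1.0, 0.0  # a target due North, unit distance

# Row indices: applies R (device -> world) to a world vector. Wrong.
dX_wrong = R[0]*E + R[1]*N + R[2]*U   # -sin(30) = -0.5

# Column indices: applies R-transpose (world -> device). Right.
dX_right = R[0]*E + R[3]*N + R[6]*U   # +sin(30) = +0.5
```

Here the two variants disagree even in sign, so the mislabelled aircraft drifts the wrong way as you pan.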
These produce completely different results. Both compile without complaint.

Device → Camera frame. Android's sensor defines +Zd as pointing out of the screen toward your face; the camera convention requires +Cz to point into the scene. So Cz = −dZ, always. This is the only correction needed in portrait mode.

Camera → Screen. After the perspective divide and FOV normalisation, the Y axis flips: Ypx = (1 − NDCy) × H/2. Camera +Cy is up; screen y = 0 is at the top. Miss this and an aircraft above the horizon appears below screen centre.

Real captured values (ATR72, 18,000 ft):
```
User:     24.8600°N, 80.9813°E
Aircraft: 24.9321°N, 81.0353°E

ENU: E=5,453 m  N=8,014 m  U=5,486 m
Bearing 34.2° (NNE), Elevation 29.5°, Range 11.1 km

Camera frame (after R⊤ + sign fix): (729, 4692, 10077)
Magnitude: 11,140 m ≈ 11,138 m (ENU range)

Screen (1080×1997, θH=66°, θV=50°): (600 px, 1 px)
```
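The final hop for those captured camera-frame values (perspective divide, FOV normalisation, Y flip) can be sketched as follows, assuming a symmetric pinhole frustum; the function name is mine:

```python
import math

def camera_to_screen(cx, cy, cz, width, height, fov_h_deg, fov_v_deg):
    """Pinhole projection: perspective divide, FOV normalise, flip Y."""
    ndc_x = cx / (cz * math.tan(math.radians(fov_h_deg / 2)))
    ndc_y = cy / (cz * math.tan(math.radians(fov_v_deg / 2)))
    x_px = (1 + ndc_x) * width / 2    # +Cx is right; screen x grows right
    y_px = (1 - ndc_y) * height / 2   # +Cy is up, but screen y=0 is the top
    return x_px, y_px

x_px, y_px = camera_to_screen(729, 4692, 10077, 1080, 1997, 66, 50)
# roughly (600, 1.5): consistent with the captured (600 px, 1 px)
```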
Phone azimuth 33.0°, aircraft bearing 34.2° → 1.2° right of centre. Phone pitched −4.3°, elevation 29.5° → net 33.8° up, just inside the top edge of the frustum. Physically consistent throughout.

Happy to answer questions about any stage of the pipeline, or anything else that's interesting.
