Agentic AI 애플리케이션 테스트를 5시간에서 10분으로 단축했습니다.

hackernews | | 🔬 연구
#agentic ai #ai 테스트 #ai 평가 #llmsec #review #보안 #평가 엔진
원문 출처: hackernews · Genesis Park에서 요약 및 분석

요약

1. **Analyze the Request:** * **Role:** News summarization expert. * **Task:** Summarize the provided article. * **Constraints:** * Concise (2-4 sentences). * Focus on most important information. * DO NOT copy-paste. * DO NOT simply restate the title (must add new info). * Language: English. * **Input:** A title ("I Reduced 5 hours of Testing my Agentic AI applcaition to 10 mins") and source (hackernews), but *crucially* the "Article Text" is missing. It just repeats the title. 2. **Analyze the Input Data:** * Title: I Reduced 5 hours of Testing my Agentic AI applcaition to 10 mins * Source: hackernews * Article Text: [Empty/Repeats Title] 3. **Identify the Problem:** The user provided a title but no actual article text to summarize. 4. **Formulate a Strategy:** * Since there is no content to summarize *beyond* the title, I cannot fulfill the "Add new information beyond the title" constraint. * However, looking at the prompt structure, it seems like a template where the user *forgot* to paste the body of the article, or perhaps the "Article Text" field provided in the prompt is literally just the title repeated. * *Self-Correction/Check:* Let me look closer at the prompt provided. * "Article Text: I Reduced 5 hours of Testing my Agentic AI applcaition to 10 mins". * Okay, the article text provided *is* just the title. There is no body content. 5. **Determine the Response:** * I cannot generate

본문

A comprehensive Evaluation, Testing, and Security engine for Agentic AI applications. LLMSec is an advanced framework that acts primarily as a Testing and Evaluation Engine for Agentic AI applications, while also supplying a robust suite of Security Testing features. Define comprehensive testing environments ("Bots" or "Targets") by providing text explanations of their purpose, enabling our internal agents to autonomously analyze, evaluate, and attack your chat AI interfaces. Whether testing against a direct REST API or intercepting a web-based Chat UI using our Chrome Extension, LLMSec orchestrates everything from functional Use Cases to sophisticated Multi-Turn adversarial attacks. - Bot Context Engine: Define your target model's purpose and limits so LLMSec agents know precisely how to interact with it. - Use Cases & Test Cases: Manually build or automatically compose hierarchical Use Cases and individual Test Cases. - Evaluation Scoring: Let the engine analyze the AI response and assign standard quantitative evaluation scores. - Ground Truth Pipeline: Store historical execution results and lock in validated runs as regression "Ground Truth Data". - Adaptive Execution: The LLMSec agent halts execution to ask the human user clarifying questions if it lacks the specific context needed to grade or execute a test. - Advanced Attack Vectors: Run Prompt Injections, Role-Playing (DAN/STAN), Persuasion, Encoding, and Storyboarding attacks. - Sequential Attacks: Coordinate multi-turn conversational social engineering attacks that dynamically adapt to the target's defenses. - REST API: Direct server-to-server testing. - Browser Extension: A powerful Chrome Extension that interacts directly with any DOM-based web chat application, bypassing complex authentication and API mocking entirely. - Python 3.9+ - Node.js 16+ (for web dashboard) - Google Chrome (for browser extension) cd backend python -m venv venv source venv/bin/activate # Windows: venv\\Scripts\\activate pip install -r requirements.txt cp .env.example .env # Start the FastAPI Server uvicorn main:app --reload The API is now available at http://localhost:8000 (Swagger UI at /docs ) cd frontend npm install npm run dev Visit the dashboard at http://localhost:3000 - Open Chrome and navigate to chrome://extensions/ . - Enable "Developer mode" in the top right. - Click "Load unpacked". - Select the extension/ directory from this repository. - Pin the extension. When you click it, set the Backend URL to http://localhost:8000 and click "Connect". For detailed guides on setting up, using the Chrome Extension, and exploring the backend models, please check our documentation folder. - Architecture Deep Dive: Explore the system architecture, file structure, component breakdown, and Database schemas. - Installation & Troubleshooting Guide: Detailed dependency setup, Docker coming soon, and deep troubleshooting steps for Backend, Frontend, and the Database. - Detailed Usage Guide: Master the API, build Custom Scenarios, trigger Dataset Generation pipelines, and read Security Evaluation Analytics. - Always test using pytest in the backend before pushing changes. - Ensure Prettier formatting is run for Frontend modifications. - Run npm run lint inside theextension folder to validate Chrome V3 standards. This tool is strictly designed for authorized security testing. Always ensure you have explicit, written authorization before testing any system you do not own. Unauthorized security testing may be illegal. Licensed under the MIT License.

Genesis Park 편집팀이 AI를 활용하여 작성한 분석입니다. 원문은 출처 링크를 통해 확인할 수 있습니다.

공유

관련 저널 읽기

전체 보기 →