I am applying for a role in London and, as a South Korean citizen, can obtain a UK YMS Visa, which grants me the right to work in the UK.
I can obtain a UK YMS Visa as a South Korean citizen, which grants me the authority to work in the UK for 2 years.
from a LinkedIn job posting
Why are you interested in Replit?
My passion for integrating AI features into existing services aligns with Replit's mission to build the world's most ubiquitous programming environment. Replit stands at the forefront of innovation in IDE code generation, a space I have followed closely through hands-on use of tools such as Tabnine, GitHub Copilot, Tabby, and AlphaCodium. My keen interest in AI engineering, specifically for code generation, convinces me that I can make meaningful contributions at Replit. The platform's goal of merging learning with creating matches my aim to integrate AI in ways that support and improve the work of developers around the world.
What makes you particularly well suited for this role?
AI code generation is cost-intensive mainly because source code, with its special characters, tends to consume more LLM tokens than natural-language text of comparable length, so less of it fits in a given context window. My work on the GitGPT project, which focused on code generation techniques, involved researching methods like Chain of Code and applying Retrieval-Augmented Generation (RAG) to reduce token usage. I believe that an AI code executor is crucial for practical AI code development, with the aim of replicating human coding behavior and distributing tasks efficiently among AI agents.
I have managed GCP GKE, handling all aspects of deployment environments for a year and am comfortable with cloud container environments. Beyond managing GKE, I have implemented a Slack error notification bot using InfluxDB and Grafana. My expertise extends across various AI backend and frontend frameworks, including PyTorch and Hugging Face's transformers. Additionally, my experience with web-based AI applications enables me to contribute quickly to implementations and focus on enhancing the user experience through web text streaming for an intuitive user interface.
I keep up with AI optimization techniques, including quantization with GPTQ and AWQ, and have applied them in my projects with prompt optimization to reduce costs. Additionally, I value knowledge sharing, systematically organizing information in Notion, which supports documentation and team collaboration. This blend of technical expertise and practical experience makes me a suitable candidate for Replit's innovative environment.
What is a project or piece of work you are most proud of and why?
MBTI-GPT, a web AI service that leverages social data to deliver MBTI types with explainable reasoning, is one of my proudest achievements. This project goes beyond traditional self-assessment methods by employing generative AI to draw meaningful correlations between online behavior and personality traits, with support from the field of personality psychology. My contribution involved applying real-time RAG indexing and using OpenAI's JSON mode for structured responses, which significantly improved the user experience. Beyond these technical advancements, the core of this project was to facilitate a deeper understanding of oneself and others. It demonstrates my commitment to leveraging AI in novel ways to create impactful, user-centric products that foster self-awareness and mutual understanding.
Cohere
What technical expertise makes you a good candidate for the Forward Deployed Engineering team?
I firmly believe in the significance of Retrieval Augmented Generation (RAG) for creating practical AI services, emphasizing the need for efficient task distribution among AI agents by mimicking human behavior. Optimization plays a crucial role in the effective utilization of specialized AIs. I have consistently followed up on AI optimization techniques, applying quantization methods like GPTQ and AWQ to projects, achieving computing cost reductions.
During my time at Kakao Mobility, I managed GCP GKE (Google Kubernetes Engine), including its network, for a year, gaining experience in cloud Kubernetes container environments. I also deployed InfluxDB and Grafana on GKE to implement a Slack error notification bot. Lastly, my previous company experience has made me aware of the importance of knowledge sharing, and I use Notion every day to organize knowledge in a tree structure, which can positively impact the documentation and collaboration culture within your team.
Please provide an example of starting with a customer’s needs and working backward to build a solution.
MBTI-GPT is a Web AI service that leverages social data to deliver MBTI types with explainable reasoning. MBTI is highly trendy in South Korea, consistently engaging cultural interest. People enjoy linking their behaviors to their MBTI traits, and I created a service that intriguingly connects these aspects using Generative AI. This project surpasses traditional self-assessment methods by employing generative AI to establish meaningful connections between online behavior and personality traits, underpinned by personality psychology. My role encompassed the innovative application of real-time RAG indexing and the strategic implementation of OpenAI's JSON mode for structured responses, seamlessly integrated into the UI without the need for post-processing. Beyond these technical enhancements, the core of this project was to facilitate a deeper understanding of oneself and others, demonstrating my commitment to using AI innovatively to develop impactful, user-centric products that enhance self-awareness and mutual understanding.
Upstage
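Please freely describe your motivation for applying to Upstage's 'Data Research Engineer Internship' position and the career direction you have in mind.
I have a favorable impression of Upstage's contributions to and activity in the AI industry, particularly its results from fine-tuning LLaMa2 and its activity on the KoLLM leaderboard. I have a solid understanding of LLM modules such as RAG and function calling, as well as chat and instruction models, as they are used in services, and I believe this makes me a good match for the kind of talent Upstage seeks. I also have experience developing private LLMs such as LLaMa2GPTQ, and I believe AI should evolve to be tailored to each user. To that end, I want to help develop personalized AI products through various modules and parameter tuning, and thereby contribute to an AI ecosystem that is closely tied to everyday life.
Please freely describe your work history (or experience) related to the position you are applying for.
I believe function calling and RAG are important for practical AI services, and that tasks should be distributed efficiently among AI agents by mimicking human behavior. I consider optimization important for the efficient use of various specialized AIs. I continuously follow AI optimization techniques and have applied quantization methods such as GPTQ and AWQ to projects, achieving computing cost reductions. I am also knowledgeable about several AI backend and frontend frameworks, including PyTorch and Hugging Face transformers.
MBTI-GPT is an AI web service I developed recently. The service uses social data to deliver MBTI types along with supporting reasoning. It goes beyond MBTI's traditional self-assessment method, using generative AI on the basis that there are meaningful correlations between online behavior and personality traits. Technically, I greatly improved the UX through real-time RAG indexing and the strategic use of OpenAI's JSON mode, helping users understand themselves and the people around them more deeply.
While working as a developer at Kakao Mobility, I managed GCP GKE, including its network, for a year, gaining experience in cloud Kubernetes container environments. I also deployed InfluxDB and Grafana in the GKE environment to implement a Slack error notification bot. Finally, my previous company experience has made me aware of the importance of knowledge sharing, and because I routinely use Notion to organize information systematically in a tree structure, I can have a positive influence on the team's documentation and collaboration culture.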
Browser Company
Which programming languages/technologies are you familiar with and rate your level of experience in each (beginner, comfortable, expert).
- TypeScript: Expert - Integrated C++ and Rust libraries into the Node.js runtime with N-API bindings for performance enhancements. Extensive experience in monorepo management for efficient development workflows; published numerous npm packages, available at npmjs.com/~seonglae, and gained a deep understanding of the JavaScript module-based web development ecosystem through work with package managers like pnpm and Yarn Berry. Understands Node.js runtime internals, including libuv and V8, and is skilled in writing efficient and readable code using JavaScript's async generators, proxies, and TypeScript's generics and abstract classes.
- Python: Expert - Proficient in using Python across a broad spectrum of applications, from web development to automation and data analysis. Treats fluency with Python's standard libraries, such as functools and asyncio, as fundamental for code readability and collaboration. Extensive experience in publishing pip packages through pyproject, and in managing project dependencies and virtual environments with various package managers in line with Python development best practices. Additionally, an understanding of coroutines, futures, and tasks enables writing more efficient Python code.
- Kubernetes: Comfortable - Solid experience in managing GCP GKE environments, including deployment, scaling, and management of containerized applications. Configured network in GKE from domain name certificates to internal network flow. Additionally, developed an efficient workflow through the 3-level namespace division for development, testing, and deployment environments and managed batch jobs using Kubernetes job scheduling. Contributed to efficient infrastructure management through the convenience of container-based deployments.
What are your top 3-5 skills? (e.g. Swift, app performance, ML modeling, etc.)
- Retrieval Augmented Generation (RAG): In the ReSRer project, I indexed 21 million Wikipedia passages into a Milvus vector database within 10 hours, demonstrating efficient large-scale data management. I improved text embedding generation speed by up to 300% over the baseline on the same hardware, using asynchronous HTTP requests and a Text Embeddings Inference server on just two RTX 3090 GPUs. In the MBTI-GPT project, I implemented real-time indexing to in-memory Faiss, enabling secure data manipulation with RAG. In the Texonom project, I indexed 30,000 pages in pgVector and implemented search over them, evaluating several indexing and embedding models to find the most efficient solution for the use case.
- Prompt Optimization: During the ReSRer project, I utilized prompt scoring for ODQA, guided by the principles from the "LLM as optimizers" paper, showcasing my ability to leverage theoretical concepts for practical application enhancements. My involvement in MBTI GPT further exemplifies my skills in prompt optimization, achieving a 30% reduction in OpenAI API costs through meticulous prompt refinement in JSON mode. These efforts demonstrate my competence in crafting and refining prompts to secure precise and relevant outputs from language models, optimizing AI interactions within constrained context windows.
- Machine Learning Performance Optimization: In the ReSRer project, I deployed a multi-GPU inference server to accelerate text generation for LLaMa2 and Zephyr, improving the efficiency of model serving. In the LLaMa2 GPTQ project, I reduced computing costs and cut memory usage by 75% through 4-bit GPTQ quantization of the model, supporting high-performance AI applications with optimized resource usage.
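The asynchronous-request approach mentioned in the RAG item can be sketched roughly as below. This is a minimal illustration, not the ReSRer code: `fake_embed` is a hypothetical stand-in for one HTTP request to an embedding server, and the batch size and concurrency cap are assumed values.

```python
import asyncio


async def fake_embed(batch: list[str]) -> list[list[float]]:
    """Hypothetical stand-in for one HTTP request to an embedding server."""
    await asyncio.sleep(0.01)  # simulated network and inference latency
    # Toy "embedding": [length, vowel count]; a real server returns dense vectors.
    return [[float(len(t)), float(sum(c in "aeiou" for c in t))] for t in batch]


async def embed_all(
    texts: list[str], batch_size: int = 4, max_in_flight: int = 8
) -> list[list[float]]:
    """Send batches concurrently, capped by a semaphore, and keep input order."""
    limit = asyncio.Semaphore(max_in_flight)

    async def worker(batch: list[str]) -> list[list[float]]:
        async with limit:
            return await fake_embed(batch)

    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    # gather returns results in the same order as the awaitables passed in
    results = await asyncio.gather(*(worker(b) for b in batches))
    return [vec for batch_vecs in results for vec in batch_vecs]


passages = [f"passage {i}" for i in range(20)]
vectors = asyncio.run(embed_all(passages))
```

Because the batches wait on the server concurrently instead of one after another, the client keeps the GPUs busy; the semaphore is there so the server is not flooded with more requests than it can queue.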
What piqued your interest in The Browser Company?
My interest in The Browser Company is driven by its innovative use of external black-box LLM APIs, an area where I see significant opportunities to contribute. Using vector databases for embeddings and optimizing prompts within limited context windows are crucial for getting more out of black-box LLMs. My experience with the ReSRer project, where I indexed 21 million Wikipedia passages into a vector database, has equipped me with the skills necessary for managing and optimizing large datasets and embedding models for real-time applications.
Furthermore, my work on the MBTI GPT project has honed my ability to approach AI product development creatively, something I believe is pivotal for contributing to The Browser Company's vision of a revolutionary browser experience. I am confident that my background in leveraging vector databases and optimizing prompts, coupled with my innovative approach to AI product development, aligns well with The Browser Company's goals. I am excited about the possibility of applying my skills and experience to enhance real-time browsing experiences and contribute to the next wave of innovations at The Browser Company.
Naver
I view interpretability as a critical aspect of both safety and alignment in artificial intelligence systems. An interpretable AI is not only transparent but also easier to steer, contributing to better alignment with intended goals. Achieving alignment is crucial, and interpretability provides the means to ensure that the AI system operates as intended.
In the pursuit of alignment, several methods for preference optimization exist, such as dataset manipulation, reinforcement learning, and prompt optimization. These techniques allow for fine-tuning and steering the AI toward desired outcomes, providing a level of control and alignment akin to human decision-making. ControlNet is one example, showing that only controllable AI can be truly useful. In summary, my experience and understanding of alignment and interpretability would be valuable for the team. I am eager to contribute by developing guardrail software for LLMs.
Projects
ReSRer
2023.9 ~ 2024.1
- Integrated OpenAI's GPT-3.5 LLM into a retrieval-based QA framework and evaluated it on the Natural Questions dataset.
- Indexed 21 million Wikipedia passages in 10 hours for the ReSRer project, leveraging a Milvus vector database.
- Enhanced text embedding generation speed by 300% using asynchronous HTTP requests and a dedicated inference server.
- Utilized prompt scoring guided by "LLM as optimizers" insights in ReSRer, improving ODQA performance by 20%.
- Deployed a multi-GPU inference server in ReSRer, accelerating text generation for LLaMa2 and Zephyr models.
MBTI GPT
2023.10 ~ 2024.2
- Achieved a 30% reduction in OpenAI API costs for MBTI GPT through precise prompt refinement.
- Developed a GPT-based analyzer for evaluating KakaoTalk chats, offering insights with evidence-based scoring.
- Innovated real-time RAG indexing to in-memory Faiss to improve prompt efficiency, ensuring secure, accurate data retrieval. https://mbti.texonom.com/result/e2l462homr83
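The idea of indexing into memory at request time can be sketched as follows. This is an illustration under stated assumptions, not the MBTI GPT implementation: `InMemoryIndex` is a hypothetical brute-force stand-in for an in-memory Faiss index, and the three-dimensional vectors are hand-written toy values rather than real embeddings.

```python
import math


class InMemoryIndex:
    """Hypothetical brute-force stand-in for an in-memory vector index."""

    def __init__(self) -> None:
        self.vectors: list[list[float]] = []
        self.texts: list[str] = []

    def add(self, vector: list[float], text: str) -> None:
        # Indexing happens at request time; nothing is written to disk.
        norm = math.sqrt(sum(x * x for x in vector)) or 1.0
        self.vectors.append([x / norm for x in vector])
        self.texts.append(text)

    def search(self, query: list[float], k: int = 2) -> list[str]:
        norm = math.sqrt(sum(x * x for x in query)) or 1.0
        q = [x / norm for x in query]
        # Cosine similarity reduces to a dot product on normalized vectors.
        scores = [sum(a * b for a, b in zip(q, v)) for v in self.vectors]
        ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
        return [self.texts[i] for i in ranked[:k]]


index = InMemoryIndex()
index.add([1.0, 0.0, 0.0], "I love planning trips in detail")
index.add([0.0, 1.0, 0.0], "Let's just see what happens")
index.add([0.9, 0.1, 0.0], "I made a checklist for the weekend")
top = index.search([1.0, 0.0, 0.0], k=2)
```

Only the top-ranked messages would then be placed in the prompt, which is how retrieval keeps the prompt short; keeping the index in memory means the chat data is discarded when the request ends.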
RTSum
2023.3 ~ 2023.8
- Minimized AI hallucination by decomposing sentences into smaller units and recomposing them.
- Boosted relation-triple extraction speed by 300% by combining a reverse proxy and container scaling. https://arxiv.org/abs/2310.13895
ReSRer
2023.9 ~ 2024.1
- Deployed multi-GPU local inference servers in ReSRer, accelerating text generation for LLaMa2 and Zephyr models (TGI, TEI).
Hannam
2024.1 ~ 2024.2
- Fine-tuned Gemma on a Korean chat dataset using SFT with the PEFT QLoRA adapter, after initially training it with full parameters on Korean Wikipedia and SNS comments. https://huggingface.co/seonglae/hannam-md
LLaMa2GPTQ
2023.6 ~ 2023.8
- Optimized local chat model’s computing and memory usage through GPTQ 4-bit model quantization. https://llama2gptq.nuxt.space/
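As a rough check on why 4-bit quantization saves so much memory, the weight storage can be estimated with simple arithmetic. The sketch below assumes a 16-bit baseline and a 7-billion-parameter model purely for illustration; it counts weights only and ignores activations, the KV cache, and quantization metadata such as scales and zero points.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GiB (weights only)."""
    return num_params * bits_per_weight / 8 / 2**30


params = 7e9  # assumed model size, for illustration only
fp16 = weight_memory_gib(params, 16)  # assumed 16-bit baseline
int4 = weight_memory_gib(params, 4)   # 4-bit quantized weights
reduction = 1 - int4 / fp16           # fraction of weight memory saved

print(f"fp16: {fp16:.1f} GiB, 4-bit: {int4:.1f} GiB, saved: {reduction:.0%}")
```

Going from 16 bits to 4 bits per weight stores a quarter of the data, so the weight footprint drops by 75% regardless of the parameter count chosen here.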
Texonom
2020.6 ~ 2022.3
- Integrated a recommender system into Web backend using ONNX & transformers.js for inference.
- Deployed Texonom action to GPT Store https://chat.openai.com/g/g-rDt50Ud94-texonom
- Embedded 30,000 documents within a knowledge system for RAG-enabled vector search. https://texonom.com/
Skills
RAG (Expert)
- Indexed 21 million Wikipedia passages in 10 hours for the ReSRer project, leveraging a Milvus vector database.
- Enhanced text embedding generation speed by 300% using asynchronous HTTP requests and a dedicated inference server.
ML Performance Optimization (Expert)
- Utilized prompt scoring guided by "LLM as optimizers" insights in ReSRer, improving ODQA performance by 20%.
- Deployed a multi-GPU inference server in ReSRer, accelerating text generation for LLaMa2 and Zephyr models.
Tools
PyTorch (Comfortable)
- Proficient in tensor manipulation, employing broadcasting and a variety of built-in PyTorch functions.
- Knowledgeable about PyTorch 2.0's graph mode and experienced in distributed training methods such as DDP and FSDP.
Python (Expert)
- Skilled in using Python's standard libraries, for example, dataclasses, functools and asyncio, to improve code quality and teamwork.
- Experienced in creating and deploying pip modules, setting up virtual environments using different package managers like rye.
HF ecosystem
- Fine-tuned Gemma on a Korean chat dataset using SFT with the PEFT QLoRA adapter, after initially training it with full parameters on Korean Wikipedia and SNS comments.
- Implemented multi-GPU local inference servers using TGI and TEI to speed up inference.
Kubernetes (Comfortable)
- Operated service on GCP GKE environment, configured network, utilized namespaces for environment separation, and automated tasks with Kubernetes job scheduling.
TikTok
Hello, I am Seonglae Cho, but feel free to call me Alan Jo. I built an embedding-vector-based recommendation system from scratch for the Texonom service, and I believe this experience can broaden the range of expertise on the team. I very much look forward to meeting great teammates and working together in London. Best regards.
Proton
Why do you think this role is a good fit for you?
I have experience in model training and performance optimization in NLP.
What is it about Proton that excites you?
Proton's evolution from email to a broad range of software services excites me. I'm eager to apply my software development passion and skills at Proton. I believe my diverse and up-to-date experiences with Language models can broaden the team's spectrum and contribute to the Proton classification model.
Based on my experience with the RTSum project, breaking down email content into relation triples could lead to more accurate email classification performance. Having extensive experience in web development, I feel I can be seamlessly integrated into the email classification development immediately. I am very much looking forward to meeting good team members and working together in London. Best regards.
Notion
Joining Notion has been a goal of mine since becoming a dedicated user in early 2019. As a big fan who has extensively used your services, including the recent AI features, I'm thrilled at the opportunity to contribute to a platform that has been a significant part of my life and workflow. My experience with Notion extends beyond everyday use; I've led two Notion-based projects, making me well-versed in the Notion API and its collection and view data structures. In particular, my recent project, Texonom.com, successfully integrated AI technology with Notion's search and recommender systems, showcasing my ability to innovate within your ecosystem. Additionally, my familiarity with React, TypeScript, Node.js, and Postgres aligns well with the tech stack at Notion, so I am ready to contribute effectively from day one. The prospect of contributing to Notion, a tool that has become my second brain and a crucial part of my Zettelkasten system, excites and motivates me deeply. My experience makes me a strong fit for the AI Product Engineer role, and I am eager to bring dynamic and innovative ideas to your team.
Naver
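My strength lies in the field of AI. I have experience developing various AI services through model performance optimization. I quickly pick up the latest AI technology trends and have applied them to projects. Through these projects, I gained experience using large language models such as the OpenAI API and LLaMA2, and I gained a deep understanding of the Transformer model and the attention mechanism by writing GPT-2 code line by line.
Alongside these technical strengths, as a singularitarian about AI, I hold both hope and concern about the progress of LLMs. I have long held the vision that artificial intelligence will bring great change to humanity, and I have studied broadly across computer science with that in mind. What I realized last year while studying techniques such as GPTQ quantization and LLM context extension is that what ultimately matters is AI modeling and training data. No matter how well RAG is done, it is ultimately useless without sound model design and training data. I want to contribute to training models on good data, and I applied because I believe I can achieve that goal at Naver.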
ReSRer
2023.9 ~ 2024.1
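The ReSRer project began with my frustration at the context-length limits I ran into while using AI models. It occurred to me that summarization could be used to overcome an LLM's context limits. Thinking I could demonstrate this on the research task of Open Domain Question Answering, I carried out the project with a professor in my department.
The first task of the project was retrieving information efficiently from a large-scale dataset. To solve this, I used a Milvus vector database to index 21 million Wikipedia documents within 10 hours. In the process, I improved text embedding generation speed by 300% using asynchronous HTTP requests and a dedicated inference server. Another problem was optimizing the language model's output to give more accurate answers to questions. Drawing on insights from the "LLM as optimizers" paper, I used a prompt scoring technique and thereby improved open-domain question answering (ODQA) performance by 20%. Finally, to operate large language models efficiently, I deployed a multi-GPU inference server to accelerate text generation for the LLaMa2 and Zephyr models.
One of the most important lessons from this project is that proving even a simple idea requires many engineering tasks. I learned how to manage large-scale data efficiently, fine-tune model outputs, and optimize computing resources. What I regret is that I failed to set a clear direction early in the project and lost a lot of time on details. From this experience, I learned that rather than obsessing over performance from the initial design stage, I should plan systematically and then improve step by step. Also, over the course of this long project, similar research was released by Naver Labs; I went through that setback and learned how to adjust my direction.
- MBTI GPT (https://mbti.texonom.com/result/e2l462homr83): I developed a GPT-based analyzer that predicts a user's MBTI type by analyzing KakaoTalk chats. In this project, I reduced OpenAI API costs by 30% through precise prompt refinement and improved prompt efficiency by applying real-time RAG indexing to in-memory Faiss.
- RTSum (https://arxiv.org/abs/2310.13895): A project that minimizes AI hallucination by decomposing sentences into smaller units and recomposing them. During the project, I improved OpenIE relation triple extraction speed by 300%.
- Hannam (https://huggingface.co/seonglae/hannam-md): I fine-tuned the Gemma model on a Korean chat dataset using SFT with the PEFT QLoRA adapter.
- LLaMa2GPTQ (https://llama2gptq.nuxt.space): I optimized a local chat model's computing and memory usage through GPTQ 4-bit model quantization.
- Texonom (https://texonom.com/): I integrated a recommender system into the web backend using ONNX and transformers.js, and embedded 30,000 documents into pgVector for RAG-enabled vector search.
These projects demonstrate my efforts to solve problems and improve the user experience using AI technology.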
Inflection AI
Describe your outstanding previous achievements:
One of my standout achievements includes spearheading the MBTI-GPT project, an AI-driven web service that interprets social data to predict MBTI types, linking online behavior with personality traits innovatively. This initiative was notable for its application of real-time Retrieval Augmented Generation (RAG) and the strategic deployment of OpenAI's JSON mode for structured responses, greatly enhancing user interaction. Another pivotal project was ReSRer, where I successfully indexed 21 million Wikipedia passages within 10 hours using a Milvus vector database, demonstrating my capabilities in handling vast data sets and boosting text embedding speeds by up to 300%. Additionally, my role as the first author in the RTSUM project, recognized in the NAACL 2024 Demonstration, marks a significant achievement. Here, I designed a Knowledge Graph (KG)-based experiment that validated the principles of Interpretable AI, achieving a threefold increase in NLP OpenIE5’s extraction speed through efficient system architecture optimizations. These experiences underscore my dedication to advancing AI technology with impactful, user-focused innovations and my ability to enhance AI's real-world applications through rigorous research and development.
One of my standout achievements is leading the RTSUM project, which was recognized in the NAACL 2024 Demonstration. As the first author, I orchestrated a Knowledge Graph (KG)-based experiment to validate the principles of Interpretable AI, significantly enhancing the extraction speed of NLP OpenIE5 by 300% through innovative system architecture optimizations. This achievement exemplifies my commitment to pushing the boundaries of AI towards greater interpretability and efficiency.
In addition, I spearheaded the MBTI-GPT project, an AI-driven web service that uses social data to predict MBTI types, innovatively linking online behavior with personality traits. This project stood out for its real-time Retrieval Augmented Generation (RAG) application and the strategic use of OpenAI's JSON mode for structured responses, significantly improving user interaction. Another critical endeavor was the ReSRer project, where I indexed 21 million Wikipedia passages in just 10 hours with a Milvus vector database, showcasing my ability to manage extensive datasets and accelerate text embedding generation by up to 300%.
These projects highlight my dedication to advancing AI technology through impactful, user-focused innovations and my adeptness at enhancing AI's practical applications through comprehensive research and development.
What are the contributions you’re excited to make?
At Inflection, I am excited to contribute to the unique AI product Pi and other innovative AI products by leveraging my expertise in AI model optimization, data curation, and user-centric design. My experience with large-scale data management, prompt optimization, and deployment of AI in web services aligns with Inflection's mission to create personal AIs that amplify the best aspects of humanity. I am particularly eager to explore novel model architectures and training techniques, optimize AI for real-world applications, and ensure these technologies are aligned with user well-being and happiness. I aspire to join a team that prioritizes excellence, ownership, and collaborative success, aiming to contribute to the development of groundbreaking yet ethically grounded AI technologies. I am motivated by the prospect of contributing to conversational AI, driven by a passion for continuous innovation and a commitment to creating universally beneficial AI solutions.
Could you please give a brief around your experience with A.I programming and coding
I am proficient in Python and have hands-on experience with the PyTorch ecosystem, including distributed training with DDP and FSDP and tools like torchrun and torchtune. Within the Hugging Face ecosystem, which builds on PyTorch, I have served models with TGI and TEI and fine-tuned models with Accelerate and PEFT. Additionally, I am familiar with Rust and have experience serving models in WebAssembly with ONNX Runtime.