The LLaMA (Large Language Model Meta AI) model (arXiv:2302.13971), unveiled by Meta AI in February 2023, is a landmark in large language models, demonstrating strong capabilities in language comprehension and text generation.
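For a concrete taste, here is a minimal sketch of loading and sampling a LLaMA-family checkpoint with Hugging Face transformers; the checkpoint name below is an assumption, and the official Meta weights are gated behind an access request on the Hub.

```python
# A minimal sketch of text generation with a LLaMA-family checkpoint via
# Hugging Face transformers. The model name is a hypothetical choice; gated
# Meta checkpoints require requesting access first.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```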
In machine learning, an optimizer is the component responsible for adjusting a model's parameters during training to minimize the error, or loss, function.
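To make this concrete, here is a minimal sketch of an optimizer at work in PyTorch, where SGD repeatedly updates a toy model's weights to shrink the loss:

```python
# A minimal sketch of what an optimizer does: compute the loss, backpropagate
# gradients, and let the optimizer nudge the parameters to reduce the loss.
import torch

model = torch.nn.Linear(3, 1)                # toy model with learnable parameters
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

x, y = torch.randn(8, 3), torch.randn(8, 1)  # dummy batch
for _ in range(100):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)
    loss.backward()              # compute d(loss)/d(parameter) for every parameter
    optimizer.step()             # update parameters, e.g. p -= lr * p.grad for SGD
```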
While machine learning thrives on vast datasets and computational power, human learning operates with flexibility, intuition, and synthesis of knowledge. I have a personal interest in delving into the methodologies of both, emphasizing the adaptability of human cognition and the structured algorithms of machine learning. Ultimately, I'd love to gain a deeper understanding and envision a future where machine learning enhances human learning experiences, fosters innovation in education, and unlocks new frontiers of human potential.
The data-centric approach is essential because of its direct impact on the effectiveness and reliability of machine learning models. By prioritizing data quality, preprocessing, and feature engineering, it improves model performance, interpretability, and scalability, and it helps practitioners extract meaningful insights, identify relevant patterns, and make informed decisions across domains. Ultimately, the data-centric approach maximizes the potential of machine learning by building on high-quality, relevant data.
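A tiny sketch of what this looks like in practice with pandas, before any model is involved; the dataset and column names here are hypothetical:

```python
# Data-centric practice in miniature: deduplicate records, handle missing
# values, and engineer a feature before training anything. Columns are made up.
import pandas as pd

df = pd.read_csv("sales.csv")                    # hypothetical dataset
df = df.drop_duplicates()                        # remove exact duplicate rows
df = df.dropna(subset=["price", "quantity"])     # drop rows missing key fields
df["revenue"] = df["price"] * df["quantity"]     # simple engineered feature
df["price_z"] = (df["price"] - df["price"].mean()) / df["price"].std()  # normalize
```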
Published on March 22, 2022
This project uses Detectron2 for model retraining, aiming to improve segmentation accuracy for specific objects, particularly humans, in the COCO dataset. It starts with a Docker image setup for consistent deployment, filters the relevant categories, and balances the instance distribution to optimize the dataset for retraining. The semantic segmentation model is then retrained, and the project concludes by evaluating the retrained model's efficacy on image and video inference tasks.
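Here is a sketch of the category-filtering step using pycocotools, assuming standard COCO-format annotations; the annotation path is a placeholder, not the project's actual layout:

```python
# Filter COCO annotations down to the "person" class before retraining.
# The annotation file path is hypothetical.
from pycocotools.coco import COCO

coco = COCO("annotations/instances_train2017.json")  # hypothetical path
person_cat_ids = coco.getCatIds(catNms=["person"])   # keep only the person class
img_ids = coco.getImgIds(catIds=person_cat_ids)      # images containing people
ann_ids = coco.getAnnIds(imgIds=img_ids, catIds=person_cat_ids)
print(f"{len(img_ids)} images, {len(ann_ids)} person annotations retained")
```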
Published on April 03, 2022
This project introduces an open-source package that streamlines the conversion of depth map images recorded by a stage system into 3D point clouds, leveraging tools like Open3D, segmentation techniques, and integration with SMPL models for human motion capture and character animation applications.
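A minimal sketch of the core depth-to-point-cloud conversion with Open3D; the depth image path and camera intrinsics below are stand-ins for the stage system's actual values:

```python
# Convert one depth map into a 3D point cloud with Open3D. The file path and
# the PrimeSense default intrinsics are assumptions for illustration.
import open3d as o3d

depth = o3d.io.read_image("depth/frame_0001.png")    # hypothetical depth map
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    o3d.camera.PinholeCameraIntrinsicParameters.PrimeSenseDefault)
pcd = o3d.geometry.PointCloud.create_from_depth_image(depth, intrinsic)
o3d.io.write_point_cloud("frame_0001.ply", pcd)      # save the 3D point cloud
```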
Published on February 06, 2024
The Tiny Story Generator project fine-tunes a small language model on short stories, demonstrating that high-quality narratives can be generated with modest resources. Its use of the PEFT technique and its focus on short stories make it a distinctive approach to language generation, and its code and data are publicly available, making it a useful resource for researchers and enthusiasts interested in natural language processing and machine learning.
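As a hedged sketch of the PEFT idea, here is a LoRA-style fine-tuning setup with the Hugging Face peft library; the base model and hyperparameters are assumptions, not the project's exact configuration:

```python
# Parameter-efficient fine-tuning (PEFT) via LoRA adapters: freeze the base
# model and train only small injected matrices. Model choice is hypothetical.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # hypothetical small model
config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(base, config)                 # wrap with trainable adapters
model.print_trainable_parameters()  # only a small fraction of weights will train
```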
Published on April 22, 2024
This project facilitates local document parsing and question answering with large language models (LLMs). The pipeline covers document parsing, text chunking, vectorization, prompting, and LLM-based question answering, all orchestrated through a streamlined, Dockerized environment. Running locally offers benefits in privacy, cost efficiency, educational value, customization, and scalability, with potential use cases spanning enterprises, research institutions, legal firms, and educational settings. It leverages tools such as Docker, Unstructured, FAISS, LangChain, and Llama.cpp for straightforward setup and operation.
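A condensed sketch of the chunk-embed-retrieve core of such a pipeline, using FAISS directly; the `embed()` helper below is a hypothetical placeholder standing in for whatever embedding model is configured, not the project's actual code:

```python
# Chunk a document, index the chunk vectors in FAISS, and retrieve the chunks
# most relevant to a question (the context an LLM prompt would then use).
import faiss
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding; swap in a real model (e.g. sentence-transformers)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(384, dtype=np.float32)   # hypothetical 384-dim vector

def chunk(text: str, size: int = 500) -> list[str]:
    """Naive fixed-size chunking; real pipelines often split on structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]

doc = "Local LLM pipelines keep documents private and costs predictable. " * 40
chunks = chunk(doc)
vectors = np.stack([embed(c) for c in chunks])

index = faiss.IndexFlatL2(vectors.shape[1])    # exact L2 nearest-neighbour index
index.add(vectors)
_, hits = index.search(embed("What are the benefits?")[None, :], 3)
top_chunks = [chunks[i] for i in hits[0]]      # context passed to the LLM prompt
```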
This is a description of your talk, which is a markdown file that can be markdown-ified like any other post. Yay markdown!
This is a description of your conference proceedings talk; note the different value in the type field. You can put anything in this field.
Short description of portfolio item number 1
Short description of portfolio item number 2
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.
Course, Harvard University, Extension School, 2020
I have been working as teaching staff at the Harvard Extension School for the following courses:
This blog post is part of the "Deep Learning in Depth" series, where I'll dive into the world of PyTorch. It aims to cover both fundamental concepts and advanced techniques in PyTorch through coding examples.
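As a small taste of the coding-example style the series uses, here is a minimal PyTorch autograd snippet:

```python
# Tensors and autograd in four lines: track operations on x, then ask PyTorch
# for the gradient of y with respect to x.
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)  # track operations on x
y = (x ** 2).sum()                                 # y = x1^2 + x2^2
y.backward()                                       # compute dy/dx
print(x.grad)                                      # tensor([4., 6.]) = 2 * x
```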