akarazniewicz / cocosplit

Simple tool to split COCO annotations into train/test datasets.

coco deeplearning datapreprocessing

Updated Aug 15, 2023
Python

irenekarijadi / CEEMDAN-EWT-LSTM

Wind Power Forecasting Based on Hybrid CEEMDAN-EWT Deep Learning Method

prediction artificial-intelligence lstm forecasting deeplearning renewable-energy datapreprocessing ceemdan

Updated Sep 28, 2023
Python

munsif200 / Transformer-Network-for-Short-term-Photovoltaic-Power-Generation-Forecasting

Utilizes a Convolutional-based Transformer architecture for accurate and efficient PV power forecasting.

training datapreprocessing sequance-creations convolutional-based-transformer pv-forcasting energy-generation-forcasting

Updated Jan 22, 2024
Python

cereja-project / cereja

Cereja is a bundle of useful functions we don't want to rewrite, and it's just pure fun!

python console utilities progress-bar tokenizer python-library python3 colab file-converter data-tools tfidf hacktoberfest array-manipulations progress-view datapreprocessing freq freqitems hacktoberfest2024

Updated Dec 29, 2025
Python

ChenTaHung / Monotonic-Optimal-Binning

The Monotonic Optimal Binning algorithm is a statistical approach that transforms continuous variables into optimal, monotonic categorical variables.

python statistics risk credit-card-fraud feature-engineering monotone mob credit risk-management monotonic pava credit-risk datapreprocessing data-discretization monotonic-optimal-binning monotone-optimal-binning pool-adjacent-violators

Updated Nov 6, 2025
Python

yakupkaplan / Home-Credit-Default-Risk

This project predicts home credit default risk for clients, that is, whether or not a client will have payment difficulties.

data-mining exploratory-data-analysis feature-engineering classification-model credit-risk datapreprocessing lightgbm-classifier

Updated Apr 2, 2022
Python

Karan-Malik / prepdata

Automating the process of Data Preprocessing for Data Science

python data machine-learning random-forest numpy sklearn pandas python3 datascience pip classification preprocessing dataframe dataanalysis datapreprocessing pypi-package regress

Updated Jun 2, 2021
Python

KaramiMostafa / MachineLearningInHealthcare

This repository focuses on two machine learning projects in the healthcare domain.

machine-learning image-processing feature-extraction classification data-processing parkinsons-disease regression-analysis melanoma updrs-scale datapreprocessing melanoma-detection

Updated Jul 4, 2023
Python

MohammedSaim-Quadri / Intrusion_Detection-System

This project is an Intrusion Detection System (IDS) using machine learning (ML) and deep learning (DL) to detect network intrusions. It leverages the CICIDS2018 dataset to classify traffic as normal or malicious. Key features include data preprocessing, model training, hyperparameter tuning, and Docker containerization for scalable deployment.

docker deep-learning cybersecurity neural-networks intrusion-detection machinelearning bayesian-optimization hyperparameter-tuning datapreprocessing cicids2018

Updated Nov 19, 2025
Python

autolordz / docx-content-modify

A small Python script tool that automatically batch-generates postal receipts for legal work. It extracts content from court judgments so the receipts do not have to be filled in by hand.

python generator excel xlsx office batch docx text-generator pythonscraping datapreprocessing

Updated Aug 9, 2022
Python

ENGRZULQARNAIN / ScrapySub

ScrapySub is a Python library designed to recursively scrape website content, including subpages. It fetches the visible text from web pages and stores it in a structured format for easy access and analysis. This library is particularly useful for NLP and AI developers who need to gather large amounts of web content for their projects.

python scraper crawling scraping-websites python-package urllib3 datapreprocessing datapreparation

Updated Jul 14, 2024
Python

cybergeekgyan / ResumeScreening

Resume Screening using Machine Learning and Python

python data-science machine-learning natural-language-processing exploratory-data-analysis datacleaning datapreprocessing resumescreening

Updated Jun 25, 2021
Python

FawziElNaggar / Question-MCQ-_Answer_Generation

This project generates multiple-choice questions (MCQs) from any type of text, along with answers and distractors for them.

machine-learning natural-language-processing deep-learning spacy question-answering keyword-extraction datapreprocessing allennlp bert-model gpt-2 pka roberta-model t5-model distractors

Updated Nov 25, 2021
Python

SubeyteT / Supermarket_App_BI-Analytics

LGBM and logistic regression for predicting whether customers of an online market app will make a second transaction.

encoding eda feature-selection feature-extraction logistic-regression feature-engineering target-detection datapreprocessing lgbm

Updated Oct 3, 2021
Python

RodrigoSdeCarvalho / pyEasyML

Python version of my machine learning framework. It provides data preprocessing, feature selection, classification, regression, more complex deep learning models, model persistence, autoencoders, and anomaly detection.

data machine-learning feature-selection autoencoder anomaly-detection datapreprocessing

Updated Jan 15, 2024
Python

divithraju / divith-aju-Hadoop-Pyspark-pipeline

This project demonstrates the creation of a scalable data processing pipeline for handling and analyzing log data from a hypothetical e-commerce platform. Leveraging Hadoop and PySpark, the pipeline is designed to process large volumes of log files, providing meaningful insights into user behavior, system performance, and sales metrics.

client documentation data database apache-spark pipeline bigdata project python3 pyspark hdfs software-engineering ecommerce-platform dataengineering datapreprocessing apache-hadoop-framework project-repository dataingestionframework

Updated Aug 17, 2024
Python

AntoinePinto / StringPairFinder

Algorithm designed to match strings by similarity

datascience fuzzywuzzy fuzzymatching datapreprocessing stringmatching stringpairfinder datamatching

Updated Jan 26, 2025
Python

Kawai-Senpai / UltraClean

UltraClean is a fast and efficient Python library for cleaning and preprocessing text data for AI/ML tasks and data processing.

data-science aiml dataset cleaner datapreprocessing spamdetection

Updated Dec 30, 2024
Python

pavankethavath / Car_dekho_car_price_prediction

A Streamlit web app utilizing Python, scikit-learn, and pandas for used car price prediction. Features data preprocessing (scaling, encoding), Random Forest model optimization with GridSearchCV, and interactive user input handling. Achieves high accuracy (R² score: 0.9028), showcasing skills in machine learning, data engineering, and deployment.

Updated Nov 27, 2024
Python

krishna-chandel / Gesture-based-HCI-ML-Project

Teaching computers to understand sign language! This project uses image processing to recognize hand signs, making technology more inclusive and accessible.

opencv random-forest svm knn kmeans-clustering hierarchical-clustering signlanguage datapreprocessing handgesture-recognition signlanguagerecognition

Updated Apr 15, 2024
Python

datapreprocessing

Here are 87 public repositories matching this topic...
