Data Scientist | Business Transformation & Analytics
Data-driven professional transitioning into data science with 20+ years of experience in digital transformation, operations, and strategic decision-making. Proficient in Python, SQL, data science, and machine learning, with a strong ability to translate business challenges into data-driven solutions. Passionate about leveraging analytics, automation, and AI to optimize business processes and drive innovation.
Smart media organizer using deep learning, metadata, and hashing for automated classification and cleanup.
This tool helps you structure a photo/video archive by recursively scanning folders, extracting date and event info, classifying images using ResNet-50, and identifying exact and perceptual duplicates (images and videos).
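The exact-duplicate step can be sketched with the standard library alone. This is a minimal illustration, not the tool's actual code: the function names, the extension list, and the choice of SHA-256 are all assumptions. Perceptual (near) duplicates need an image-aware hash such as the ones ImageHash provides and are not covered here.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of media extensions; the real tool may handle more formats.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic", ".mp4", ".mov"}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Recursively scan root and group media files with identical content."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in MEDIA_EXTENSIONS:
            groups[file_digest(path)].append(path)
    # Keep only digests shared by two or more files.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Reading in chunks keeps memory use flat even for large video files, which matters in an archive of this size.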
After consolidating all the media files, I ended up with 780 GiB of data in more than 90,000 individual files. Organizing an archive of this size by hand would take ages.
Python, PyTorch, Transforms, TIMM, OpenCV, PIL, CNNs, Residual Networks, ImageHash, EXIFRead, PyHEIF, JSON
/destination/
├── 2024-
│   └── 03-10/
│       └── Birthday/
│           ├── pict00001_dog.jpg
│           └── vid00001_party.mp4
├── duplicated/
├── near_duplicated/
└── noevaluate/
Files renamed as:
pict00001_dog.jpg
vid00005_sunset.mp4
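The naming pattern above can be reproduced with a small helper. This is a sketch under assumptions: `build_name` and `VIDEO_EXTENSIONS` are hypothetical names, the label is assumed to come from the classifier, and the five-digit zero-padded counter is inferred from the examples.

```python
from pathlib import Path

# Assumed set of video extensions; anything else is treated as a picture.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi"}


def build_name(index: int, label: str, source: Path) -> str:
    """Build a name like pict00001_dog.jpg from a counter, a label, and the source file."""
    extension = source.suffix.lower()
    prefix = "vid" if extension in VIDEO_EXTENSIONS else "pict"
    # Zero-pad the counter to five digits, as in the examples above.
    return f"{prefix}{index:05d}_{label}{extension}"


print(build_name(1, "dog", Path("IMG_1234.JPG")))
print(build_name(5, "sunset", Path("clip.mp4")))
```

Whether pictures and videos share one counter or use separate ones is not stated here, so the caller is left to supply the index.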