Python - Cutting out parts of a video using face detection (quick and dirty)

@MS-DOS posted this topic a couple of days ago:

I was pretty sure this could be achieved with python using a couple of libraries... so I did it.

 

Here is the script:

Spoiler

import face_recognition
import cv2
import numpy as np
import ffmpeg
import argparse
import os
import shutil

parser = argparse.ArgumentParser()
parser.add_argument("--video", type=str, required=True, help="Input video")
parser.add_argument("--face", type=str, required=True, help="Image of the face to look for")
parser.add_argument("--name", type=str, default="target", help="Input name")
parser.add_argument("--scale", type=float, default=1, help="Video scaling for processing")
parser.add_argument("--framerate", type=float, default=29.97, help="Video framerate")
parser.add_argument("--frameskip", type=int, default=30, help="Run face recognition on one frame out of every N")
parser.add_argument("--output", type=str, default="out.mp4", help="Output file name")
args = parser.parse_args()

video_capture = cv2.VideoCapture(args.video)

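# Load a sample picture and learn how to recognize it.
target_image = face_recognition.load_image_file(args.face)
target_encodings = face_recognition.face_encodings(target_image)
if not target_encodings:
    # face_encodings returns an empty list when no face is detected
    raise SystemExit("No face found in " + args.face)
target_face_encoding = target_encodings[0]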

# Create arrays of known face encodings and their names
known_face_encodings = [
    target_face_encoding,
]
known_face_names = [
    args.name
]

segments = [{'start': 0, 'end': 0}]
segment_count = 0
last_frame_saved = 0
count = 1
framerate = args.framerate
scale = args.scale

# Initialize some variables
face_locations = []
face_encodings = []
face_names = []
process_this_frame = True

while True:
    # Grab a single frame of video
    ret, frame = video_capture.read()
    if not ret:
        break

    # Resize frame of video for faster face recognition processing
    small_frame = cv2.resize(frame, (0, 0), fx=scale, fy=scale)

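    # Convert the image from BGR color (which OpenCV uses) to RGB color (which face_recognition uses).
    # cvtColor returns a contiguous array; a reversed slice view can be rejected by newer dlib builds.
    rgb_small_frame = cv2.cvtColor(small_frame, cv2.COLOR_BGR2RGB)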

    # Only run face recognition on every Nth frame (--frameskip) to save time
    if count % args.frameskip == 0:
        # Find all the faces and face encodings in the current frame of video
        face_locations = face_recognition.face_locations(rgb_small_frame)
        face_encodings = face_recognition.face_encodings(rgb_small_frame, face_locations)

        face_names = []
        for face_encoding in face_encodings:
            # See if the face is a match for the known face(s)
            matches = face_recognition.compare_faces(known_face_encodings, face_encoding)
            name = "Unknown"

            # Use the known face with the smallest distance to the new face
            face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)
            best_match_index = np.argmin(face_distances)
            if matches[best_match_index]:
                name = known_face_names[best_match_index]

            face_names.append(name)

    # If the target was seen in the last processed frame, extend the current segment or start a new one
    if args.name in face_names:
        if count == last_frame_saved + 1:
            segments[segment_count]['end'] = count
        else:
            segments.append({'start': count, 'end': count})
            segment_count = segment_count + 1
        last_frame_saved = count
        print(segments)
        cv2.imshow('Video3', frame)
    
    count = count + 1

    # Hit 'q' on the keyboard to quit!
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
video_capture.release()
cv2.destroyAllWindows()

os.makedirs("tmp", exist_ok=True)
inputs = []
for segment in segments:
    segment['start'] = segment['start'] / framerate
    segment['end'] = segment['end'] / framerate
    length = segment['end']-segment['start']
    if length > 0.1:
        ffmpeg.input(
            args.video,
            ss=segment['start'],
            t=length
        ).output(
            "tmp/" + str(segment['start']) + ".mp4"
        ).overwrite_output().run()
        print(ffmpeg.probe("tmp/" + str(segment['start']) + ".mp4"))
        inputs.append(ffmpeg.input("tmp/" + str(segment['start']) + ".mp4").video)
        inputs.append(ffmpeg.input("tmp/" + str(segment['start']) + ".mp4").audio)

ffmpeg.concat(
    *inputs, v=1, a=1
).output(args.output).overwrite_output().run()
shutil.rmtree("tmp")

 

It accepts a video and an image of the person you want to isolate and outputs a video file containing only the parts where that person is on screen.
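
For anyone who wants the gist without reading the whole script, here is a minimal sketch of the segment bookkeeping on its own: consecutive frame numbers where the target was seen are grouped into runs, converted to seconds, and runs that are too short are dropped. The function name frames_to_segments and the min_length parameter are made up for this illustration; the script above does the same thing inline with a list of dicts.

```python
def frames_to_segments(matched_frames, framerate, min_length=0.1):
    """Group consecutive frame numbers into (start, end) times in seconds.

    Hypothetical helper for illustration; not part of the original script.
    """
    runs = []
    start = previous = None
    for frame in sorted(matched_frames):
        if previous is not None and frame == previous + 1:
            previous = frame  # still inside the current run
            continue
        if start is not None:
            runs.append((start, previous))  # close the previous run
        start = previous = frame
    if start is not None:
        runs.append((start, previous))

    # Convert frame numbers to seconds and drop runs that are too short to cut.
    return [
        (s / framerate, e / framerate)
        for s, e in runs
        if (e - s) / framerate > min_length
    ]


if __name__ == "__main__":
    seen = list(range(10, 21)) + list(range(50, 56)) + [80]
    print(frames_to_segments(seen, framerate=10))
```

Each (start, end) pair is what ends up being passed to ffmpeg as the ss and t (end minus start) options.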

 

To run it you'll need to install ffmpeg, Python 3.8 and the following Python packages using pip (argparse ships with Python, so it doesn't need installing):

pip3.8 install dlib face_recognition opencv-python ffmpeg-python

 

then pass the command line arguments, e.g.

py.exe -3.8 .\autocut.py --video .\bvt.mp4 --face .\b.PNG --name biden --frameskip 30 --scale 1
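
If the script dies on an import or can't find ffmpeg, a quick check like the one below can tell you what is missing before you start a long run. This is just a sketch: the helper name missing_requirements is made up, and the module list assumes the usual import names (opencv-python is imported as cv2, ffmpeg-python as ffmpeg).

```python
import importlib.util
import shutil

# Import names, not pip package names (assumed mapping:
# opencv-python -> cv2, ffmpeg-python -> ffmpeg).
REQUIRED_MODULES = ["face_recognition", "cv2", "numpy", "ffmpeg"]


def missing_requirements(modules=REQUIRED_MODULES, binaries=("ffmpeg",)):
    """Return the names of modules and executables that cannot be found."""
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    # ffmpeg-python only wraps the ffmpeg executable, so that has to be on PATH too.
    missing += [b + " (binary)" for b in binaries if shutil.which(b) is None]
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("Missing: " + ", ".join(problems) if problems else "Everything found")
```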

 

As a proof of concept I took this video with highlights from the recent US presidential debates:

and fed this screencap of Biden to the script:

[image: b.PNG]

and this was the result:

[image: output.gif]

 

(this is just a gif for demo purposes, the script produces an mp4 file with the same quality as the input video)

 

Face recognition is pretty slow on a high-resolution video (and because of the linear nature of the task it would be pretty challenging to parallelize), so there's an option to skip frames and only check every so often whether the person is still in the frame. This is controlled via the frameskip parameter: a value of 30 means that one frame in every 30 is checked, which is roughly once per second in this case. Skipping frames means the cuts won't be as accurate; for instance, you can see the moderator for a few frames. There is also an option to scale the frames down (a scale below 1) to speed up processing, but bear in mind this lowers the accuracy of the face detection algorithm.
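
To put numbers on the frameskip trade-off, here is a small sketch of the arithmetic. Both helper names are made up for this example. As far as I can tell from the loop, the last recognition result is reused until the next checked frame, so both the start and the end of a segment can be late by up to just under one interval.

```python
def check_interval_seconds(frameskip, framerate):
    """Seconds of video between two frames that actually get analysed."""
    if frameskip < 1 or framerate <= 0:
        raise ValueError("frameskip must be >= 1 and framerate must be > 0")
    return frameskip / framerate


def frames_checked(total_frames, frameskip):
    """How many frames are analysed when every frameskip-th frame is checked."""
    if frameskip < 1:
        raise ValueError("frameskip must be >= 1")
    return total_frames // frameskip


if __name__ == "__main__":
    total = 9000  # hypothetical clip length in frames
    for skip in (1, 10, 30):
        interval = check_interval_seconds(skip, 29.97)
        print(skip, frames_checked(total, skip), round(interval, 3))
```

A lower frameskip gives tighter cuts at the cost of proportionally more recognition work.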

 

It's possible this would be faster in a higher-performance language like C++, but it would take longer to get working, whereas this is just over a hundred lines long and runs on everything.

 

There are some edge cases where it doesn't work well due to inherent limitations of face recognition, e.g. when the person is turned as in this frame:

[image: image.png]

however, for a quick montage from, say, an interview where people always stare at the camera it shouldn't be a problem.

 

With that said it's probably just as fast and less error prone to just do this manually.

 

I hope this is useful to somebody ;) if not, at least it was interesting for me.

Don't ask to ask, just ask... please 🤨

sudo chmod -R 000 /*

What is scaling and how does it work? Asus PB287Q unboxing! Console alternatives :D Watch Netflix with Kodi on Arch Linux Sharing folders over the internet using SSH Beginner's Guide To LTT (by iamdarkyoshi)

Sauron'stm Product Scores:

Spoiler

Just a list of my personal scores for some products, in no particular order, with brief comments. I just got the idea to do them so they aren't many for now :)

Don't take these as complete reviews or final truths - they are just my personal impressions on products I may or may not have used, summed up in a couple of sentences and a rough score. All scores take into account the unit's price and time of release, heavily so, therefore don't expect absolute performance to be reflected here.

 

-Lenovo Thinkpad X220 - [8/10]

Spoiler

A durable and reliable machine that is relatively lightweight, has all the hardware it needs to never feel sluggish and has a great IPS matte screen. Downsides are mostly due to its age, most notably the screen resolution of 1366x768 and usb 2.0 ports.

 

-Apple Macbook (2015) - [Garbage -/10]

Spoiler

From my perspective, this product has no redeeming factors given its price and the competition. It is underpowered, overpriced, impractical due to its single port and is made redundant even by Apple's own iPad pro line.

 

-OnePlus X - [7/10]

Spoiler

A good phone for the price. It does everything I (and most people) need without being sluggish and has no particularly bad flaws. The lack of recent software updates and relatively barebones feature kit (most notably the lack of 5GHz wifi, biometric sensors and backlight for the capacitive buttons) prevent it from being exceptional.

 

-Microsoft Surface Book 2 - [Garbage - -/10]

Spoiler

Overpriced and rushed, offers nothing notable compared to the competition, doesn't come with an adequate charger despite the premium price. Worse than the Macbook for not even offering the small plus sides of having macOS. Buy a Razer Blade if you want high performance in a (relatively) light package.

 

-Intel Core i7 2600/k - [9/10]

Spoiler

Quite possibly Intel's best product launch ever. It had all the bleeding edge features of the time, it came with a very significant performance improvement over its predecessor and it had a soldered heatspreader, allowing for efficient cooling and great overclocking. Even the "locked" version could be overclocked through the multiplier within (quite reasonable) limits.

 

-Apple iPad Pro - [5/10]

Spoiler

A pretty good product, sunk by its price (plus the extra cost of the physical keyboard and the pencil). Buy it if you don't mind the Apple tax and are looking for a very light office machine with an excellent digitizer. Particularly good for rich students. Bad for cheap tinkerers like myself.

 

 


It's cool that you made this and maybe OP from that thread will find it somewhat helpful but I think what they were asking for was a way to isolate one person talking in a group of people.

 

That means people talking over each other not one at a time like in a debate.

Guides & Tutorials:

How To: Remotely Access a Computer, Server, or NAS

How To: Access Remote Systems at Home/Work Securely from Anywhere with Pritunl

How to Format Storage Devices in Windows 10

A How-To: Drive Sharing in Windows 10

VFIO GPU Pass-though w/ Looking Glass KVM on Ubuntu 19.04

A How-To Guide: Building a Rudimentary Disk Enclosure

Three Methods to Resetting a Windows Login Password

A Beginners Guide to Debian CLI Based File Servers

 

Guide/Tutorial in Progress:

How to Use Memtest86 to Diagnose RAM Errors

 

In the Queue:

iPXE Network Booting to an iSCSI Target

 

Don't see what you need? Check the Full List or *PM me, if I haven't made it I'll add it to the list.

*NOTE: I'll only add it to the list if the request is something I know I can do.


Last I heard, NVIDIA contributed a module to OpenCV for GPU face recognition. According to them there was a ~6x performance increase over CPU-based approaches.

CPU: Intel i7 - 5820k @ 4.5GHz, Cooler: Corsair H80i, Motherboard: MSI X99S Gaming 7, RAM: Corsair Vengeance LPX 32GB DDR4 2666MHz CL16,

GPU: ASUS GTX 980 Strix, Case: Corsair 900D, PSU: Corsair AX860i 860W, Keyboard: Logitech G19, Mouse: Corsair M95, Storage: Intel 730 Series 480GB SSD, WD 1.5TB Black

Display: BenQ XL2730Z 2560x1440 144Hz

2 minutes ago, Windows7ge said:

It's cool that you made this and maybe OP from that thread will find it somewhat helpful but I think what they were asking for was a way to isolate one person talking in a group of people.

 

That means people talking over each other not one at a time like in a debate.

I'm not sure about that, OP didn't mention audio at all... that would definitely be more complex

2 minutes ago, Sauron said:

I'm not sure about that, OP didn't mention audio at all... that would definitely be more complex

Quote

Let's say you have a TV program where people talk and you have like 6 people, and you want to make a video consisting of only one of the people and cut the rest.

Maybe he can clarify this a little more for us, because that's how I interpreted it. Depending on what kind of show he's talking about, if anyone talks over anyone else the file would have to be chopped to exclude those parts, unless their voice can be isolated from everyone else's.


Actually this is a lot older than I thought. It was a keynote at GTC 2011 using the Viola-Jones algorithm. But iirc all the new algorithms are learning-based anyway, so using tensor cores for deep learning would probably be the next step.

6 minutes ago, trag1c said:

Actually this is a lot older than I thought. It was a keynote at GTC 2011 using the Viola-Jones algorithm. But iirc all the new algorithms are learning-based anyway, so using tensor cores for deep learning would probably be the next step.

I don't have an NVIDIA card around to test that, but the recognition part of the script only consists of a couple of lines, so a drop-in replacement should be pretty simple.
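
For reference, one low-effort variant that stays within the same library (so not the OpenCV GPU module mentioned above): face_recognition's face_locations takes a model argument, "hog" by default or "cnn", and the CNN detector is the one that can use the GPU when dlib was built with CUDA. The sketch below is untested on real hardware. The helper names are made up, and reading DLIB_USE_CUDA from the dlib module is my assumption about how to detect a CUDA build.

```python
def choose_detection_model(cuda_available):
    """Pick the face_locations model: "cnn" benefits from a GPU, "hog" is CPU-friendly."""
    return "cnn" if cuda_available else "hog"


def cuda_build_available():
    """Best-effort check for a CUDA-enabled dlib (assumed attribute: DLIB_USE_CUDA)."""
    try:
        import dlib
    except ImportError:
        return False
    return bool(getattr(dlib, "DLIB_USE_CUDA", False))


def locate_faces(rgb_frame, cuda_available=None):
    """Hypothetical replacement for the face_locations call in the main loop."""
    import face_recognition  # imported lazily so the helpers above work without it

    if cuda_available is None:
        cuda_available = cuda_build_available()
    model = choose_detection_model(cuda_available)
    return face_recognition.face_locations(rgb_frame, model=model)
```

In the script this would replace the face_recognition.face_locations(rgb_small_frame) line; the encoding and comparison steps stay as they are.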

14 minutes ago, Sauron said:

I don't have an NVIDIA card around to test that, but the recognition part of the script only consists of a couple of lines, so a drop-in replacement should be pretty simple.

I am kinda curious now, so if I get time today I might try to whip together a demo that performs the same task as your script.


smh shoulda embedded the code with repl.it

 

sincerely,

definitely not a repl.it employee


Curious, what if you had the script use ffmpeg to bring the quality of the video down to something lower (let's say 360p), have it analyze that footage, but apply the changes to the original video? Theoretically it should go a lot faster with the same quality output.

5 minutes ago, pierom_qwerty said:

Curious, what if you had the script use ffmpeg to bring the quality of the video down to something lower (let's say 360p), have it analyze that footage, but apply the changes to the original video? Theoretically it should go a lot faster with the same quality output.

That's what the scale parameter does (I scale the frames using OpenCV), but unfortunately the accuracy isn't quite the same. Of course it depends on the video: in this case there are a few shots where the person is pretty far from the camera, so lowering the resolution hurts the output.
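
If you want to think in terms of a target resolution like 360p rather than a raw factor, the conversion to a --scale value is a one-liner. A sketch, with a made-up helper name; it never scales up, since analysing a frame larger than the source gains nothing.

```python
def scale_for_height(original_height, target_height=360):
    """Return the --scale factor that brings a frame down to target_height pixels."""
    if original_height <= 0 or target_height <= 0:
        raise ValueError("heights must be positive")
    # Cap at 1.0 so small videos are analysed at their native size.
    return min(1.0, target_height / original_height)


if __name__ == "__main__":
    for height in (2160, 1080, 720, 240):
        print(height, round(scale_for_height(height), 3))
```

The cuts themselves are always made on the original file by timestamp, so the scale only affects what the recognizer sees, not the output quality.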

1 hour ago, Sauron said:

I'm not sure about that, OP didn't mention audio at all... that would definitely be more complex

Yeah, and as I understood it (or at least what I thought would be very challenging), the goal is to cut out, i.e. "isolate", only one person, like you can do in picture editing programs (I think even Paint 3D can do that). I'm pretty sure there are programs that do this with video too, so you could basically replace one person with another, also known as a "deepfake".

And as such, whatever the OP of the other thread actually wanted, it should definitely be possible (since you can cut out whatever you want and leave only what you want anyway).

 

 

I don't really know what programs would be used for that though, or what the requirements are to run them...

 

Still cool you did that with the python program you made, even though the purpose remains unclear. ;)

 

 

RYZEN 5 3600 | GIGABYTE 3070 VISION OC | 16GB CORSAIR VENGEANCE LPX 3200 DDR4 | MSI B350M MORTAR | 250GB SAMSUNG EVO 860 | 4TB TOSHIBA X 300 | 1TB TOSHIBA SSHD | 120GB KINGSTON SSD | WINDOWS 10 PRO | INWIN 301| BEQUIET PURE POWER 10 500W 80+ SILVER | ASUS 279H | LOGITECH Z906 | DELL KB216T | LOGITECH M185 | SONY DUALSHOCK 4

 

LENOVO IDEAPAD 510 | i5 7200U | 8GB DDR4 | NVIDIA GEFORCE 940MX | 1TB WD | WINDOWS 10 GO HOME 
