
AnujSaharan

Member
  • Posts

    16
  • Joined

  • Last visited

Awards

This user doesn't have any awards

3 Followers

About AnujSaharan

  • Birthday September 14

Contact Methods

  • Twitter
    theAnujSaharan

Profile Information

  • Gender
    Male
  • Location
    Seattle, WA
  • Biography
    Orange Vanilla Coke Enthusiast
  • Occupation
    Big Tech Bro

AnujSaharan's Achievements

  1. Disney made money from it and publicly distributed the likeness for monetary gains. I would be in the wrong if I were publicly sharing checkpoints and inference scripts myself for someone else's voice - I fully agree with you. I have no plans to do that. Asked for permission above - if unacceptable, happy to stop posting the little snippets.
  2. Seems to be fine on the tempo now, thanks for that callout. First line from the new video to test - "It looks like a children's toy but it's actually one of the most versatile hacking tools to ever hit the market. And if you've been on TikTok in the last six months, there's a good chance you've seen people using it to change gas station signs, set off department store PA systems and open up Tesla charging ports." linus_test_2.wav The base model is meant to be more 'conversational' than a presenter voice, and that's what's reflected here. The WAN show fine-tune data is also conversational, unscripted audio, so the model seems to be making its own choices about where to take breaths, pauses etc. (it adds uhms and ahhs even though they're not explicitly in the sentence it should be generating) - which is obviously uncharacteristic of edited audio like in all the channel's videos - so the comparison isn't apples to apples. WAN show conversation isn't as high in energy and tempo - that being said, it did learn to speed up and sample better with a little bit more training. I can try making the fine-tune dataset more diverse later for better results.
  3. Of course, I am happy to stop experimenting and building if he doesn't approve or is uncomfortable - I very obviously haven't posted or distributed the model itself or any inference scripts, for privacy reasons. Will let @nicklmg or someone from the team make that call. If this gets good enough - happy to even share the model for video editing voiceovers, dubbing into foreign languages, and whatever other use cases come up. Although I will say - similar technology is out there on the web - anyone can take a small snippet and try zero-shot cloning on elevenlabs or something like that - irrespective of whether the result is good in the end or not - and that'd be fully anonymous sources cloning with fully untraceable models that live behind at least LLC-level protection. Taking offence on someone else's behalf at the output of a model is a conversation that goes far beyond just this forum thread - and applies to gpt, dall-e, stable diffusion etc (all of which are in the open domain and easily accessible to everyone) - and I am happy to follow wherever that public discourse goes.
  4. I've been training my own GAN-based TTS and now diffusion-based TTS models for quite a bit (the eventual goal is to have a 'teacher' model teach a cloned voice how to sing and rap fwiw). I've seen the guys try a couple different models zero-shot to try and clone their voices - hasn't quite hit ever, so trying to fix that. Here's a super early preliminary attempt on a ~500m parameter TTS model fine-tuned with Linus' voice from the most recent WAN show. Only fine-tuned it for ~15 minutes on a single 3080, very undertrained obviously, can probably get much better with more time. Just making a thread to track progress until it sings. Novel text from a random The Verge review to test Linus' voice against: Generated Audio: linus test.wav The model is autoregressive a la GPT-2 and Tortoise. So based on speech and words it's seen before - it may choose to change the emotional tone, add pauses, different words etc based on the training data and preceding text while generating - for example it added "i.e." and uhhs and umms near the end of the clip on its own - I straight copy pasted the highlighted text above. Rate on a scale of 1-10 in its current state? --- If you're interested in TTS btw - I post some experiments on my Twitter - (Anuj Saharan (@theAnujSaharan) / Twitter).
  5. Thanks for the suggestions everyone. I guess I'll just try the Bitfenix Shadow then.
  6. Has anyone ever used the Corsair Carbide Spec 01 case? I was thinking of moving on from my generic ultra cheap-o case to a still pretty cheap but much more awesome [I presume?] case, and this is what I found. It's available for about 60 US$ right now here in India. http://www.corsair.com/en/carbide-series-spec-01-red-led-mid-tower-gaming-case
  7. I don't know, I don't really care for 4k content on YouTube just yet, because I have neither a 4k display nor a fast enough internet connection. And I don't mean to sound rude, but 4k for reviews and tech stuff just doesn't make sense to me. 1080p is enough.
  8. I am mainly impressed by the power efficiency on the 900 series cards. I've seen people running GTX 970s on the same PSU that I have which isn't exactly the best one out there [Corsair VS650]. Not sure about the 980 though.. Aaand since I prefer Zotac [Running a GTX650 Ti Boost right now], there isn't any reason for me to not like it.
  9. Even though a rumoured sequel is already in the works and was also hinted at in the first game itself, a Watch_Dogs sequel would be pretty awesome!
  10. I did take a look at the M series. I do like it but it does cost a bit more than the ATH series. I will try and get the M30x though. Thanks for the suggestion.
  11. Aaand I am late too. I did have a great day! Thanks!
  12. Wassup peeps! So it was my birthday yesterday and I have been having trouble choosing a new pair of headphones for myself. I prefer the ones that are over the ear [circumaural] and not on the ear [supra-aural], and they should be comfortable to wear [obviously]. So I have made up my mind on the Audio Technica ATH-AX1is. They are pretty affordable for me and look like a pretty neat pair of headphones. Has anyone used them, or is anyone using them now? Not an audiophile here, so are they good enough? Aaaand if you do know of a better set of headphones around the same price range, please let me know. I currently have the Sennheiser HD180, so it definitely seems like an improvement.