Find a band’s musical influences using Python – Pearl Jam and Dr. Dre

I wrote a fun little program that finds the musical influences of any artist. It uses the Spotify API and a bit of logic.

The logic works like this:

  1. Find the artist on Spotify.
  2. Find the earliest album that artist made.
  3. Get the related artists and the date of each one's first album.
  4. Pick the related artist whose first album is closest to 2 years before the current artist's first album.
  5. Go back to step 3 with that artist.
  6. Repeat until there are no artists left, or until the program goes way off the rails.

Note: If none of the related artists has a first album older than the current artist's, the program goes back up the chain and tries to find another route.
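The selection step can be sketched roughly like this. It is a minimal illustration rather than the actual source: the `Artist` class, the `pick_influence` and `trace_influences` names, and the placeholder data are all assumptions, no Spotify calls are made, and the sketch simply stops where the real program would back up the chain.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Artist:
    name: str
    first_album_year: int


def pick_influence(current: Artist, related: List[Artist], gap: int = 2) -> Optional[Artist]:
    """Pick the related artist whose first album is closest to `gap` years earlier."""
    target = current.first_album_year - gap
    # Only artists who debuted before the current one can be influences
    older = [a for a in related if a.first_album_year < current.first_album_year]
    if not older:
        return None  # the real program would backtrack here
    return min(older, key=lambda a: abs(a.first_album_year - target))


def trace_influences(start: Artist, related_lookup: Dict[str, List[Artist]]) -> List[Artist]:
    """Follow the chain of picks until no older related artist is left."""
    chain = [start]
    while True:
        nxt = pick_influence(chain[-1], related_lookup.get(chain[-1].name, []))
        if nxt is None:
            return chain
        chain.append(nxt)


# Placeholder data standing in for Spotify's related-artist results
related_lookup = {
    "Band A": [Artist("Band B", 1989), Artist("Band C", 1985), Artist("Band D", 1993)],
    "Band B": [Artist("Band C", 1985), Artist("Band E", 1988)],
}
chain = trace_influences(Artist("Band A", 1991), related_lookup)
print(" -> ".join(a.name for a in chain))
```

Because every pick must have an earlier first album than the artist before it, the years strictly decrease and the loop always terminates.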

The source code is on GitHub here.

Example #1 – Pearl Jam

This one produces some pretty great results. Starting with Pearl Jam’s first album, Ten (1991), it goes all the way back to 1953, with decent picks along the way such as Wire – Pink Flag and The Yardbirds – London 1963.

Python: Pickle and Unpickle Tree Classifier with Hashing Vectorizer


I pulled this piece of code out of a project I am working on. I wanted to guess a tag from keywords in the body of a text. So I take the text, apply a hashing vectorizer, and pass the hashed values into an AdaBoostClassifier that uses a DecisionTreeClassifier as its base estimator. I wanted to train it once and reuse it over and over, so I used pickle to save it to the file system.

This code assumes you have a dataframe populated already.

Imports:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import HashingVectorizer
import pandas as pd
# train_test_split moved from sklearn.cross_validation to sklearn.model_selection
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
import pickle
import os.path

Set up the file paths and parameters:

resetPickle = False
foundPickle = False
# This is where you would load the dataset; it needs a 'Text' column
df_tags = pd.DataFrame()
pick_model_path1='pickles/modelAdaDecTreeClassifier.pickle'
pick_model_tags_root_pre = 'pickles/model_tag_'
pick_model_tags_root_post = '_DecTreeClassifier.pickle'
tag_pickle_path = pick_model_tags_root_pre + 'PIC' + pick_model_tags_root_post

Create the HashingVectorizer. An ngram_range of (1, 2) means it uses both single words and two-word phrases, such as “Richmond” and “Richmond VA”, as tokens:

vctrizr_tag = HashingVectorizer(ngram_range=(1, 2))
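To make the n-gram and hashing idea concrete, here is a rough pure-Python stand-in. It is not scikit-learn's implementation: `word_ngrams` and `bucket` are hypothetical helpers, the regular expression only approximates the default tokenizer, and MD5 is swapped in for the MurmurHash3 function that HashingVectorizer actually uses. The 2**20 bucket count matches the vectorizer's default `n_features`.

```python
import hashlib
import re


def word_ngrams(text, ngram_range=(1, 2)):
    """Lowercase the text and return its word n-grams, shortest first."""
    tokens = re.findall(r"\b\w\w+\b", text.lower())  # words of 2+ characters
    lo, hi = ngram_range
    grams = []
    for n in range(lo, hi + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def bucket(token, n_features=2 ** 20):
    """Map a token to a column index with a stable hash (MD5 as a stand-in)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_features


for gram in word_ngrams("Richmond VA"):
    print(gram, "->", bucket(gram))
```

The point of hashing is that no vocabulary has to be stored: the same token always lands in the same column, so the vectorizer itself never needs to be fitted or pickled.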

This checks whether the pickle exists and, if it does, loads the model from it:

if not resetPickle and os.path.isfile(tag_pickle_path):
    # Use a context manager so the file is closed after loading
    with open(tag_pickle_path, 'rb') as pickle_in:
        model_tag = pickle.load(pickle_in)
    foundPickle = True

If the pickle does not exist, train the AdaBoostClassifier and save it to the pickle:

if not foundPickle:
    # Labels: 'Tag' is an assumed column name (the post does not name it); adjust to your data
    y_tag = df_tags['Tag']
    # HashingVectorizer is stateless, so transform() works without fit()
    X_tag = vctrizr_tag.transform(df_tags['Text'])
    X_train_tag, X_test_tag, y_train_tag, y_test_tag = train_test_split(X_tag, y_tag, test_size=0.2, random_state=1)
    model_tag = AdaBoostClassifier(DecisionTreeClassifier(max_depth=44), n_estimators=25)
    model_tag = model_tag.fit(X_train_tag, y_train_tag)
    score = model_tag.score(X_test_tag, y_test_tag)
    print('score', score)
    # Make sure the pickles/ directory exists before writing
    os.makedirs(os.path.dirname(tag_pickle_path), exist_ok=True)
    with open(tag_pickle_path, 'wb') as f:
        pickle.dump(model_tag, f)
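The load-or-train pattern above can be folded into one small helper. This is only a sketch: `load_or_build` is a hypothetical name, and a plain dict stands in for the trained classifier so the example runs without scikit-learn.

```python
import os
import pickle
import tempfile


def load_or_build(path, build, reset=False):
    """Return (object, found): load the pickle at `path` or build and save it."""
    if not reset and os.path.isfile(path):
        with open(path, "rb") as f:
            return pickle.load(f), True
    obj = build()
    # Create the target directory if it is missing
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return obj, False


def train():
    # Stand-in for the expensive training step
    return {"model": "AdaBoost", "n_estimators": 25}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pickles", "model_tag_PIC.pickle")
    first, found_first = load_or_build(path, train)     # trains and saves
    second, found_second = load_or_build(path, train)   # loads from disk
    print(found_first, found_second)
```

Keep in mind that unpickling can execute arbitrary code, so only load pickle files you created yourself or otherwise trust.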

All together now:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import HashingVectorizer
import pandas as pd
# train_test_split moved from sklearn.cross_validation to sklearn.model_selection
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
import pickle
import os.path

resetPickle = False
foundPickle = False
# This is where you would load the dataset; it needs a 'Text' column
df_tags = pd.DataFrame()

pick_model_path1 = 'pickles/modelAdaDecTreeClassifier.pickle'
pick_model_tags_root_pre = 'pickles/model_tag_'
pick_model_tags_root_post = '_DecTreeClassifier.pickle'
tag_pickle_path = pick_model_tags_root_pre + 'PIC' + pick_model_tags_root_post

vctrizr_tag = HashingVectorizer(ngram_range=(1, 2))

if not resetPickle and os.path.isfile(tag_pickle_path):
    # Use a context manager so the file is closed after loading
    with open(tag_pickle_path, 'rb') as pickle_in:
        model_tag = pickle.load(pickle_in)
    foundPickle = True

if not foundPickle:
    # Labels: 'Tag' is an assumed column name (the post does not name it); adjust to your data
    y_tag = df_tags['Tag']
    # HashingVectorizer is stateless, so transform() works without fit()
    X_tag = vctrizr_tag.transform(df_tags['Text'])
    X_train_tag, X_test_tag, y_train_tag, y_test_tag = train_test_split(X_tag, y_tag, test_size=0.2, random_state=1)
    model_tag = AdaBoostClassifier(DecisionTreeClassifier(max_depth=44), n_estimators=25)
    model_tag = model_tag.fit(X_train_tag, y_train_tag)
    score = model_tag.score(X_test_tag, y_test_tag)
    print('score', score)
    # Make sure the pickles/ directory exists before writing
    os.makedirs(os.path.dirname(tag_pickle_path), exist_ok=True)
    with open(tag_pickle_path, 'wb') as f:
        pickle.dump(model_tag, f)

IPMap Python Ip Address Locator Command Line Script


This program uses the IpMap site to look up a person’s approximate location from their IP address.

It’s written in Python. Enjoy.

Download Source
GPLv3 Code. Give back.
Usage:
python ipmap.py 74.125.45.100 all
python ipmap.py 74.125.45.100
python ipmap.py (This will get you the help screen)

Args:
all = Prints all details
nomap = Prints all details, without the map
loc = Gets: Country, Region, City
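For illustration, the command line described above could be parsed along these lines. This is a sketch with argparse, not the code from ipmap.py: the `parse_args` helper is hypothetical, and what the real script does when no mode is given is not stated in this excerpt, so the sketch just leaves the mode as None.

```python
import argparse
import ipaddress


def parse_args(argv):
    """Parse an IP address plus an optional output mode (all, nomap, loc)."""
    parser = argparse.ArgumentParser(prog="ipmap.py")
    # ipaddress.ip_address raises ValueError on bad input, which argparse reports
    parser.add_argument("ip", type=ipaddress.ip_address, help="address to look up")
    parser.add_argument("mode", nargs="?", choices=["all", "nomap", "loc"],
                        help="which details to print")
    return parser.parse_args(argv)


args = parse_args(["74.125.45.100", "all"])
print(args.ip, args.mode)
```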

Python: Grab Email from Gmail and Insert into MySql Database

This script will:

  • Log in to Gmail over POP3
  • Read the first email in the mailbox
  • Delete the email it just read
  • Insert the email’s text into a MySQL database
  • Sleep for 1800 seconds, and repeat
# Updated for Python 3; MySQLdb is provided by the mysqlclient package
import poplib
import time
from email import message_from_string

import MySQLdb

SERVER = "pop.gmail.com"
USER = "gmailusername"
PASSWORD = "gmailpassword"
SLEEP_SECONDS = 1800

print("""
|------------------------------------------|
|  This is a python program that checks a  |
|  POP account and if there is a message,  |
|  it adds it to the SQL server.           |
|------------------------------------------|
          by: Daniel Folkes
                 email: [email protected]

(every 1800 seconds)
Checking POP server....
""")

while True:
    try:
        server = poplib.POP3_SSL(SERVER, 995)
        server.user(USER)
        server.pass_(PASSWORD)
        resp, items, octets = server.list()
        if not items:
            # Mailbox is empty: disconnect and wait before checking again
            server.quit()
            time.sleep(SLEEP_SECONDS)
            continue

        # Download the first message; each list entry looks like b"1 1234"
        msg_id = int(items[0].split()[0])
        resp, lines, octets = server.retr(msg_id)
        raw = b"\n".join(lines).decode("utf-8", errors="replace")

        # Keep the first 12 characters of the sender and 50 of the body
        message = message_from_string(raw)
        name = (message["From"] or "")[:12]
        note = raw.partition("\n\n")[2][:50]
        print("note: ", note)

        server.dele(msg_id)  # mark the message for deletion after reading it
        server.quit()  # POP3 applies the deletion on QUIT

        if note != "":
            db = MySQLdb.connect(host="localhost", user="USERNAME",
                                 passwd="PASSWORD", db="DATABASENAME")
            cur = db.cursor()
            if name:
                cur.execute("INSERT INTO note (note, name) VALUES (%s, %s)", (note, name))
            else:
                # A one-element tuple needs the trailing comma
                cur.execute("INSERT INTO note (note) VALUES (%s)", (note,))
            db.commit()  # MySQLdb does not autocommit by default
            db.close()
    except Exception as exc:
        # On any failure, report it and wait before trying again
        print("Unexpected error:", exc)
        time.sleep(SLEEP_SECONDS)
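The INSERT step passes the values separately from the SQL text, which lets the driver handle quoting. The sketch below shows the same pattern with the standard library's sqlite3 swapped in for MySQL so it runs anywhere; `save_note` is a hypothetical helper, and sqlite3 uses `?` placeholders where MySQLdb uses `%s`.

```python
import sqlite3


def save_note(conn, note, name=None):
    """Insert a note, with the sender name when one is available."""
    if name:
        conn.execute("INSERT INTO note (note, name) VALUES (?, ?)", (note, name))
    else:
        # (note,) is a one-element tuple; (note) would just be the string itself
        conn.execute("INSERT INTO note (note) VALUES (?)", (note,))
    conn.commit()  # without a commit the rows may never be saved


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE note (note TEXT, name TEXT)")
save_note(conn, "hello", "alice")
save_note(conn, "it's an anonymous note")
rows = conn.execute("SELECT note, name FROM note ORDER BY rowid").fetchall()
print(rows)
```

Because the apostrophe in the second note travels as a parameter rather than as part of the SQL string, it needs no manual escaping.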

Python Torrent Search and Download (TPB)

This Python command-line utility searches The Pirate Bay for a given search string, pulls out the matching torrent files, and downloads them to your current directory.

by: Daniel Folkes

This is licensed under GPLv3. Give back.


    Download Source

print """Pirate Bay Torrent Downloader - Command Line Interface
Program Written by: Daniel Folkes
website: http://danfolkes.com
email: danfolkes @t gmail dot c0m