Find a band’s musical influences using Python – Pearl Jam and Dr. Dre

I created a fun little program that will go and find the musical influences of any artist. It uses the Spotify APIs and a bit of logic.

The logic works like this:

  1. Find the artist on Spotify
  2. Go and find the earliest album that artist made.
  3. Get the related artists and their first album date
  4. Pick the related artist whose first album is closest to 2 years before the original artist’s first album
  5. Start back up with #3
  6. Repeat until there are no artists left. Or until the program goes way off the rails.

Note: If none of the related artists has a first album older than the current artist’s, the program goes back up the chain and tries to find another route.
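Here is a minimal sketch of the steps above, written in Python 3. It is not the program from the repository: the catalog is a made-up in-memory dictionary standing in for the Spotify API, the band names and years are invented, and the function name trace_influences is my own. How the real program decides when to back up is not spelled out above, so the depth-first walk here is only one reading of it.

```python
# Hypothetical catalog: artist -> (year of first album, related artists).
# The names and years are invented; the real program reads them from Spotify.
CATALOG = {
    "Band A": (1991, ["Band B", "Band C"]),
    "Band B": (1989, ["Band D"]),
    "Band C": (1985, []),
    "Band D": (1995, []),
}
TARGET_GAP = 2  # prefer a first album about 2 years before the current one


def trace_influences(start, catalog, gap=TARGET_GAP):
    """Walk back through related artists, trying the closest match first."""
    visited = []   # artists in the order they were reached
    seen = {start}

    def visit(artist):
        visited.append(artist)
        year, related = catalog[artist]
        # Keep only related artists whose first album is older.
        older = [r for r in related if r in catalog and catalog[r][0] < year]
        # The one closest to "gap years before the current artist" goes first.
        older.sort(key=lambda r: abs((year - gap) - catalog[r][0]))
        for candidate in older:
            if candidate not in seen:
                seen.add(candidate)
                # Returning from this call is the "go back up the chain" step.
                visit(candidate)

    visit(start)
    return visited


print(trace_influences("Band A", CATALOG))
# ['Band A', 'Band B', 'Band C']
```

Band B is tried first because 1989 is exactly two years before 1991. Band D is newer than Band B, so that route is a dead end and the walk backs up to Band C.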

Source Code is on Github here.

Example #1 – Pearl Jam

This has some pretty great results. Starting with Pearl Jam’s first album, Ten (1991), it goes all the way back to 1953, with decent results like Wire – Pink Flag and The Yardbirds – London 1963.

Python: Pickle and Unpickle Tree Classifier with Hashing Vectorizer


I took this piece of code out of a project I am working on. I wanted to guess the tag based on keywords in the body of text. So, I take the text, apply a hashing vectorizer, and then pass the hashed values into an AdaBoostClassifier that uses a DecisionTreeClassifier. I wanted to build it once and use it over and over again, so I used pickle to save it on the file system for reuse.

This code assumes you have a dataframe populated already.

Includes:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import HashingVectorizer
import pandas as pd
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
import pickle
import os.path

Setting up filesystem and parameters stuff:

resetPickle = False
foundPickle = False
"""This is where you would load the dataset"""
df_tags = pd.DataFrame()
pick_model_path1='pickles/modelAdaDecTreeClassifier.pickle'
pick_model_tags_root_pre = 'pickles/model_tag_'
pick_model_tags_root_post = '_DecTreeClassifier.pickle'
tag_pickle_path = pick_model_tags_root_pre + 'PIC' + pick_model_tags_root_post

Create the HashingVectorizer. The ngram_range of (1, 2) means that it will use both single words like “Richmond” and two-word phrases like “Richmond VA” as tokens:

vctrizr_tag = HashingVectorizer(ngram_range=(1, 2))
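To make the n-gram idea concrete, here is a small pure-Python sketch of which tokens a range of (1, 2) produces. The helper word_ngrams is my own illustration, not part of scikit-learn, and it is a simplification: the real HashingVectorizer also lowercases the text by default and hashes each token to a column index instead of keeping the strings.

```python
def word_ngrams(text, ngram_range=(1, 2)):
    """Return the word n-grams for every n in the inclusive range."""
    words = text.split()
    lowest, highest = ngram_range
    grams = []
    for n in range(lowest, highest + 1):
        # Slide a window of n words across the text.
        for i in range(len(words) - n + 1):
            grams.append(" ".join(words[i:i + n]))
    return grams


print(word_ngrams("Richmond VA weather"))
# ['Richmond', 'VA', 'weather', 'Richmond VA', 'VA weather']
```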

This will check to see if the pickle exists. It will load it into the model if it exists:

if not resetPickle and os.path.isfile(tag_pickle_path):
    # Use a context manager so the file is closed after loading
    with open(tag_pickle_path, 'rb') as pickle_in:
        model_tag = pickle.load(pickle_in)
    foundPickle = True

If the pickle does not exist, it will go and train the AdaBoostClassifier and save it into the pickle:

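if not foundPickle:
    # NOTE: y_tag must hold the labels to predict; select the label column of df_tags here
    y_tag = df_tags
    # Hash the text once; HashingVectorizer is stateless, so no fit step is needed
    X_tag = vctrizr_tag.transform(df_tags['Text'])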
    X_train_tag, X_test_tag, y_train_tag, y_test_tag = train_test_split(X_tag, y_tag, test_size=0.2, random_state=1)
    model_tag = AdaBoostClassifier(DecisionTreeClassifier(max_depth=44),n_estimators=25)
    model_tag = model_tag.fit(X_train_tag, y_train_tag)
    score = model_tag.score(X_test_tag, y_test_tag)
    print('score',score)
    with open(tag_pickle_path, 'wb') as f:
        pickle.dump(model_tag, f)
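The load-or-train pattern above can be wrapped in one reusable helper. This is a minimal sketch in Python 3 using only the standard library; the name load_or_build and its parameters are my own, and the build argument stands in for whatever trains the model. Keep in mind that pickle.load can run arbitrary code, so only load pickle files you created yourself.

```python
import os
import pickle


def load_or_build(path, build, reset=False):
    """Load a pickled object from path, or build it and save it there."""
    if not reset and os.path.isfile(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    obj = build()
    # Create the target folder (for example 'pickles/') if it is missing.
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return obj
```

With a helper like this, the training code would go into a function that returns the fitted model, and passing reset=True would play the role of resetPickle.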

All together now:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import HashingVectorizer
import pandas as pd
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
import pickle
import os.path
resetPickle = False
foundPickle = False
"""This is where you would load the dataset"""
df_tags = pd.DataFrame()
 
pick_model_path1='pickles/modelAdaDecTreeClassifier.pickle'
pick_model_tags_root_pre = 'pickles/model_tag_'
pick_model_tags_root_post = '_DecTreeClassifier.pickle'
tag_pickle_path = pick_model_tags_root_pre + 'PIC' + pick_model_tags_root_post
vctrizr_tag = HashingVectorizer(ngram_range=(1, 2))
if not resetPickle and os.path.isfile(tag_pickle_path):
    # Use a context manager so the file is closed after loading
    with open(tag_pickle_path, 'rb') as pickle_in:
        model_tag = pickle.load(pickle_in)
    foundPickle = True
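if not foundPickle:
    # NOTE: y_tag must hold the labels to predict; select the label column of df_tags here
    y_tag = df_tags
    # Hash the text once; HashingVectorizer is stateless, so no fit step is needed
    X_tag = vctrizr_tag.transform(df_tags['Text'])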
    X_train_tag, X_test_tag, y_train_tag, y_test_tag = train_test_split(X_tag, y_tag, test_size=0.2, random_state=1)
    model_tag = AdaBoostClassifier(DecisionTreeClassifier(max_depth=44),n_estimators=25)
    model_tag = model_tag.fit(X_train_tag, y_train_tag)
    score = model_tag.score(X_test_tag, y_test_tag)
    print('score',score)
    with open(tag_pickle_path, 'wb') as f:
        pickle.dump(model_tag, f)

Short Domain Name Finder

This is a short URL finder I created.
It checks whether each domain is available and remembers the result.


A Python script in the background populates a MySQL database, and a PHP page displays the results.
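The background script is not shown here, so the following is only a rough sketch of the idea in Python 3. Everything in it is an assumption: sqlite3 stands in for the MySQL database so the example runs on its own, the ".com" ending is a guess, and fake_is_available is a placeholder for a real WHOIS or registrar lookup.

```python
import itertools
import sqlite3
import string


def candidate_names(length, alphabet=string.ascii_lowercase, tld=".com"):
    """Yield every name of the given length, e.g. 'aa.com', 'ab.com', ..."""
    for letters in itertools.product(alphabet, repeat=length):
        yield "".join(letters) + tld


def record_checks(conn, names, is_available):
    """Check each name once and remember the answer in the database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS domains "
        "(name TEXT PRIMARY KEY, available INTEGER)"
    )
    for name in names:
        already = conn.execute(
            "SELECT 1 FROM domains WHERE name = ?", (name,)
        ).fetchone()
        if already is None:  # skip names that were checked before
            conn.execute(
                "INSERT INTO domains (name, available) VALUES (?, ?)",
                (name, int(is_available(name))),
            )
    conn.commit()


def fake_is_available(name):
    """Placeholder check; a real one would query WHOIS or a registrar."""
    return name.startswith("z")


conn = sqlite3.connect(":memory:")
record_checks(conn, candidate_names(1, alphabet="xyz"), fake_is_available)
print(conn.execute("SELECT name, available FROM domains ORDER BY name").fetchall())
# [('x.com', 0), ('y.com', 0), ('z.com', 1)]
```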

Converting a Magnet Link into a Torrent

UPDATE – 2012-04-30 – User Faless on GitHub has added a good bit of functionality. Check it out.

For some reason, my version of the rtorrent client on Ubuntu does not open magnet links. So, I wanted to see if there was a way to create torrent files from magnet links. I couldn’t find a good example, so I wrote my own.

This will convert a magnet link into a .torrent file:

First, make sure your system has Python and the libtorrent Python library:

sudo apt-get install python-libtorrent

Then, you can run the following code by calling this command:

python Magnet2Torrent.py

File Magnet2Torrent.py:

'''
Created on Apr 19, 2012
@author: dan
'''

if __name__ == '__main__':
    import os
    import time
    import libtorrent as lt

    # Take one timestamp so the folder name and the file name match
    stamp = str(time.time())
    TorrentFilePath = "/home/dan/torrentfiles/" + stamp + "/"
    TorrentFilePath2 = TorrentFilePath + stamp + ".torrent"
    # The folder has to exist before the .torrent file can be written into it
    if not os.path.isdir(TorrentFilePath):
        os.makedirs(TorrentFilePath)

    ses = lt.session()
    #ses.listen_on(6881, 6891)
    params = {
        'save_path': TorrentFilePath,
        'duplicate_is_error': True}
    link = "magnet:?xt=urn:btih:599e3fb0433505f27d35efbe398225869a2a89a9&dn=ubuntu-10.04.4-server-i386.iso&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.ccc.de%3A80"
    handle = lt.add_magnet_uri(ses, link, params)
#    ses.start_dht()
    print('saving torrent file here : ' + TorrentFilePath2 + " ...")
    # Poll until the torrent metadata has arrived
    while (not handle.has_metadata()):
        time.sleep(.1)

    torinfo = handle.get_torrent_info()

    # Copy the file list, comment, and creator into a new torrent
    fs = lt.file_storage()
    for file in torinfo.files():
        fs.add_file(file)
    torfile = lt.create_torrent(fs)
    torfile.set_comment(torinfo.comment())
    torfile.set_creator(torinfo.creator())

    with open(TorrentFilePath2, "wb") as f:
        f.write(lt.bencode(torfile.generate()))
    print('saved and closing...')
 
#Uncomment to Download the Torrent:
#    print 'starting torrent download...'
 
#    while (handle.status().state != lt.torrent_status.seeding):
#        s = handle.status()
#        time.sleep(55)
#        print 'downloading...'

This will create a folder inside of ‘/home/dan/torrentfiles/’ with a structure like:

/home/dan/torrentfiles/545465456.12/545465456.12.torrent
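If you are curious what is actually inside a magnet link, here is a small sketch that pulls it apart with the Python 3 standard library. It is separate from Magnet2Torrent.py and does not need libtorrent; the function name parse_magnet and the returned dictionary keys are my own.

```python
from urllib.parse import parse_qs, urlparse


def parse_magnet(link):
    """Split a magnet link into its info hash, display name, and trackers."""
    parsed = urlparse(link)
    if parsed.scheme != "magnet":
        raise ValueError("not a magnet link")
    params = parse_qs(parsed.query)  # also decodes %3A, %2F, and so on
    xt = params.get("xt", [""])[0]
    prefix = "urn:btih:"
    return {
        "info_hash": xt[len(prefix):] if xt.startswith(prefix) else None,
        "name": params.get("dn", [None])[0],
        "trackers": params.get("tr", []),
    }


link = ("magnet:?xt=urn:btih:599e3fb0433505f27d35efbe398225869a2a89a9"
        "&dn=ubuntu-10.04.4-server-i386.iso"
        "&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80")
info = parse_magnet(link)
print(info["name"])       # ubuntu-10.04.4-server-i386.iso
print(info["trackers"])   # ['udp://tracker.openbittorrent.com:80']
```

The link carries only the hash, a name, and trackers. The file list is not in it, which is why the script has to wait for the metadata before it can write the .torrent file.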

I added this to GitHub if you want to Fork it.
https://github.com/danfolkes/Magnet2Torrent

ENJOY!

IP Locator Webservice – PHP – Ipmap – Command Line

This is similar to my Python script here:
http://danfolkes.com/index.php/2009/04/29/ipmapcom-python/

It uses this site’s service to look up the location of each user:
http://www.ipmap.com/

It outputs in XML, Plain Text, and HTML.
Fields:

  • ip
  • hostname
  • ipreverse
  • country
  • region
  • city
  • blacklist
  • gmaps

HERE IS THE LINK TO THE WEB SERVICE SITE:
http://www.danfolkes.com/ipmaps/

IPMap Python Ip Address Locator Command Line Script


This program uses the IpMap site to get people’s locations based on their IP addresses.

It’s written in Python. Enjoy.

Download Source. GPLv3 code. Give back.
Usage:
python ipmap.py 74.125.45.100 all
python ipmap.py 74.125.45.100
python ipmap.py (This will get you the help screen)

Args:
all = Prints all details
nomap = Gets All, no map
loc = Gets: Country, Region, City
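The usage above could be handled with something like the following Python 3 sketch. This is not the code from ipmap.py, which is not shown in this excerpt. It uses argparse, the names build_parser and parse_args are my own, and what the real script does when no mode is given is an assumption (here the mode is simply None).

```python
import argparse


def build_parser():
    """Describe the command line: an IP address plus an optional mode."""
    parser = argparse.ArgumentParser(
        prog="ipmap.py",
        description="Look up the location of an IP address.",
    )
    parser.add_argument("ip", help="IP address to look up")
    parser.add_argument(
        "mode",
        nargs="?",
        choices=["all", "nomap", "loc"],
        default=None,
        help="all = all details, nomap = all but the map, "
             "loc = country, region, city",
    )
    return parser


def parse_args(argv):
    """Return the parsed arguments, or print the help screen if there are none."""
    parser = build_parser()
    if not argv:
        parser.print_help()
        return None
    return parser.parse_args(argv)


args = parse_args(["74.125.45.100", "loc"])
print(args.ip, args.mode)
# 74.125.45.100 loc
```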

Python – Cell Phone Number Pad Input V2

This is a rewrite of my original post. This rewrite was made by Rami Davis [ramidavis at y a h o o .c o m] XDA Dev Forums.

It would go perfectly with this:
http://www.flickr.com/photos/svofski/3383950702/in/pool-make

#!/usr/bin/python3
#using python3
 
import time
 
#Cellphone keyboard imitation
cell_keyboard = {
    "0" : (" ",),  # the trailing comma makes this a one-element tuple
    "1" : ("!","@","#","$","%","^","&","*","(",")"),
    "2" : ("a","b","c","A","B","C"),
    "3" : ("d","e","f","D","E","F"),
    "4" : ("g","h","i","G","H","I"),
    "5" : ("j","k","l","J","K","L"),
    "6" : ("m","n","o","M","N","O"),
    "7" : ("p","q","r","s","P","Q","R","S"),
    "8" : ("t","u","v","T","U","V"),
    "9" : ("w","x","y","z","W","X","Y","Z"),
    "#" : (" ",), #add something here
    "*" : (" ",)
}
 
THRESHOLD = 1.0         # Constant: Seconds before resetting the keyboard
user_input = ""         # Current user input
last_key = ["", 0]         # Last key and repetitions.
 
print("Valid characters: 0-9 # * and q to quit")
 
while not user_input == "q":
    try:
        #Measure the time it takes to respond.
        response_time = time.time()
        user_input = input(">>")[0]
        response_time = time.time() - response_time
    except IndexError:
        user_input = ""
 
    #Check if it's valid
    if user_input in cell_keyboard:
 
        #If it matches the last key
        if user_input == last_key[0]:
 
            # And it was within threshold
            if response_time < THRESHOLD:
                last_key[1] += 1
            else:
                last_key[1] = 0
        else:
            # Assign the new key and 0 repetitions
            last_key[0] = user_input
            last_key[1] = 0
 
        #Now request the element on the keyboard.
        try:
            print(cell_keyboard[last_key[0]][last_key[1]])
        except IndexError:
            last_key[1] = 0
            print(cell_keyboard[last_key[0]][last_key[1]])
 
    else:
        print("Not a valid key.")

Thanks Rami!
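If you want to play with the logic without typing keys against the clock, here is a small variation that takes a list of recorded presses instead. It is not Rami's code: the function name decode and the (key, seconds since the previous press) format are my own, the key table is cut down to lowercase letters, and it returns the finished text instead of printing a character on every press.

```python
# Simplified key table (lowercase only) for this sketch.
KEYS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz", "0": " ",
}
THRESHOLD = 1.0  # seconds before a repeated key starts a new letter


def decode(presses, threshold=THRESHOLD):
    """Turn (key, seconds_since_previous_press) pairs into text."""
    text = []
    last_key, count = None, 0
    for key, delay in presses:
        if key not in KEYS:
            continue  # ignore anything that is not on the pad
        if key == last_key and delay < threshold:
            # Same key pressed quickly: cycle to the next letter, wrapping around.
            count += 1
            text[-1] = KEYS[key][count % len(KEYS[key])]
        else:
            # New key, or a pause: start a new letter.
            last_key, count = key, 0
            text.append(KEYS[key][0])
    return "".join(text)


print(decode([("4", 0.0), ("4", 0.2), ("4", 2.0), ("4", 0.1), ("4", 0.1)]))
# hi
```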

Python: Grab Email from Gmail and Insert into MySql Database

This script will:

  • Log into Gmail over POP
  • Read the email
  • Delete the read email
  • Insert the email’s text into a MySQL database
  • Sleep for 1800 seconds when there is nothing to read, and repeat
try:
        import poplib, sys, time
        import string, random
        import StringIO, rfc822
        import datetime
        SERVER = "pop.gmail.com"
        USER  = "gmailusername"
        PASSWORD = "gmailpassword"
        i = 0;
        print """
        |------------------------------------------|
        |  This is a python program that checks a  |
        |  POP account and if there is a message,  |
        |  it adds it to the SQL server.           |
        |------------------------------------------|
                  by: Daniel Folkes
                         email: [email protected]
 
        (every 1800 seconds)
        Checking POP server....
"""
        while 1:
                try:
                        server = poplib.POP3_SSL(SERVER, 995)
                        server.user(USER)
                        server.pass_(PASSWORD)
                except:
                        print "error setting up server."
 
 
                resp, items, octets = server.list()
                # download a random message
                try:
                        id, size = string.split(items[0])
                        resp, text, octets = server.retr(id)
 
                        text = string.join(text, "\n")
                        file = StringIO.StringIO(text)
                        note = ""
                        name = ""
                        message = rfc822.Message(file)
                        for k, v in message.items():
                                if k=='from':
                                                name = v[:12]
                        note = message.fp.read()[:50]
                        print "note: ", note
                        server.dele(1) #this will delete the message after you read it
                        server.quit()
                #-------------------------------------------
                        if note !="":
                                import MySQLdb
                                db = MySQLdb.connect(host="localhost", user="USERNAME", passwd="PASSWORD",db="DATABASENAME")
 
                                cur2 = db.cursor()
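                                if name:
                                        cur2.execute("INSERT INTO note (note, name) VALUES (%s, %s)", (note, name))
                                else:
                                        # pass the parameters as a tuple (note the trailing comma)
                                        cur2.execute("INSERT INTO note (note) VALUES (%s)", (note,))
                                # commit so the insert is saved on transactional tables
                                db.commit()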
                except:
                        i+=1
                        #print "Unexpected error:", sys.exc_info()[0]
                        time.sleep(1800)
except:
        print "Failed Unexpectedly"
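The script above is Python 2, and the rfc822 and StringIO modules it relies on are not available in Python 3. As a rough sketch, the message-parsing step could look like this in Python 3 with the standard email package. The function name summarize and its parameters are my own; the 12 and 50 character limits are the ones used in the script.

```python
from email import message_from_string


def summarize(raw, name_len=12, note_len=50):
    """Return (sender, start of the body) for one raw email message."""
    msg = message_from_string(raw)
    name = (msg.get("From") or "")[:name_len]
    if msg.is_multipart():
        # Take the first plain-text part, if there is one.
        parts = [p for p in msg.walk() if p.get_content_type() == "text/plain"]
        body = parts[0].get_payload() if parts else ""
    else:
        body = msg.get_payload()
    return name, body[:note_len]


raw = "From: Dan <dan@example.com>\nSubject: hi\n\nBuy milk\n"
name, note = summarize(raw)
print(name)          # Dan <dan@exa
print(note.strip())  # Buy milk
```

With poplib in Python 3, the lines returned by retr() are bytes, so they would need to be joined and decoded before being passed to a function like this.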

Python: Wunderground Today’s Weather to Email SMS to Phone


I could see somebody setting this as a cron task to send every morning so when you wake up, you get the current weather as a text message. 

I am about to set it up. 

Enjoy
————————————————————

import urllib2
import time
ZIP = "20190"
ACCOUNT = "d37"  # put your gmail email account name here
PASSWORD = "neah"  # put your gmail email account password here
to_addrs = "[email protected], [email protected]"
subject = "Wunderground Email"
 
try:
	f = urllib2.urlopen('http://www.wund.com/cgi-bin/findweather/getForecast?query='+ZIP)
	page = f.read()
	i = page.find('<div id="main">')
	page2 = page[i:]
	i = page2.find("<span>")+6
	page = page2[i:]
	i = page.find("</span>")
	temperature = page[:i]
 
	i = page.find("<h4>")+4
	page = page[i:]
	i = page.find("</h4>")
	current = page[:i]
 
	i = page.find("<span>")+6
	page = page[i:]
	i = page.find("</span>")
	wind = page[:i]
 
 
	i = page.find("<span>")+6
        page = page[i:]
        i = page.find("</span>")
        dewpoint = page[:i]
 
 
	i = page.find("<b>")+3
        page = page[i:]
        i = page.find("</b>")
        pressure = page[:i]
 
 
        i = page.find('<div class="b">')+15
        page = page[i:]
        i = page.find("</div>")
        humidity = page[:i]
 
 
        i = page.find("<span>")+6
        page = page[i:]
        i = page.find("</span>")
        visibility = page[:i]
 
 
	page = page[9:]
 
        i = page.find("</span>")+8
        page = page[i:]
        i = page.find("</h5")
        updated = page[:i]
 
 
	i = page.find('<div id="forecast">')
	page = page[i:]
        i = page.find('<span>')+6
        page = page[i:]
	i=page.find('</span>')
	d1n = page[:i]
 
	i = page.find('<div>')+4
	page = page[i:]
	i = page.find('<div>')+5
	page = page[i:]
	i = page.find('</div>')
	d1 = page[:i]
 
	page = page[i:]
	i = page.find('<span>')+6
	page = page[i:]
	i = page.find('</span>')
        d1h = page[:i]
 
	page = page[i:]
        i = page.find('<span>')+6
        page = page[i:]
        i = page.find('</span>')
        d1l = page[:i]
 
	i = page.find('<td')
	page = page[i:]
 
#===================
        i = page.find('<span>')+6
        page = page[i:]
        i=page.find('</span>')
        d2n = page[:i]
 
        i = page.find('<div>')+4
        page = page[i:]
        i = page.find('<div>')+5
        page = page[i:]
        i = page.find('</div>')
        d2 = page[:i]
 
        page = page[i:]
        i = page.find('<span>')+6
        page = page[i:]
        i = page.find('</span>')
        d2h = page[:i]
 
        page = page[i:]
        i = page.find('<span>')+6
        page = page[i:]
        i = page.find('</span>')
        d2l = page[:i]
 
        i = page.find('<td')
        page = page[i:]
 
#===================
        i = page.find('<span>')+6
        page = page[i:]
        i=page.find('</span>')
        d3n = page[:i]
 
        i = page.find('<div>')+4
        page = page[i:]
        i = page.find('<div>')+5
        page = page[i:]
        i = page.find('</div>')
        d3 = page[:i]
 
        page = page[i:]
        i = page.find('<span>')+6
        page = page[i:]
        i = page.find('</span>')
        d3h = page[:i]
 
        page = page[i:]
        i = page.find('<span>')+6
        page = page[i:]
        i = page.find('</span>')
        d3l = page[:i]
 
        i = page.find('<td')
        page = page[i:]
 
#===================
        i = page.find('<span>')+6
        page = page[i:]
        i=page.find('</span>')
        d4n = page[:i]
 
        i = page.find('<div>')+4
        page = page[i:]
        i = page.find('<div>')+5
        page = page[i:]
        i = page.find('</div>')
        d4 = page[:i]
 
        page = page[i:]
        i = page.find('<span>')+6
        page = page[i:]
        i = page.find('</span>')
        d4h = page[:i]
 
        page = page[i:]
        i = page.find('<span>')+6
        page = page[i:]
        i = page.find('</span>')
        d4l = page[:i]
 
        i = page.find('<td')
        page = page[i:]
 
#===================
        i = page.find('<span>')+6
        page = page[i:]
        i=page.find('</span>')
        d5n = page[:i]
 
        i = page.find('<div>')+4
        page = page[i:]
        i = page.find('<div>')+5
        page = page[i:]
        i = page.find('</div>')
        d5 = page[:i]
 
        page = page[i:]
        i = page.find('<span>')+6
        page = page[i:]
        i = page.find('</span>')
        d5h = page[:i]
 
        page = page[i:]
        i = page.find('<span>')+6
        page = page[i:]
        i = page.find('</span>')
        d5l = page[:i]
 
        i = page.find('<td')
        page = page[i:]
 
 
except urllib2.URLError, e:
	# URLError lives in urllib2; only its HTTPError subclass has .code and .read()
	print e
 
#print "Temp:" + temperature
#print "Current" + current
#print "Wind: " + wind
##print "Dew: "+ dewpoint
#print "Pressure:" + pressure
#print "Humid:" + humidity
#print "Visib:" + visibility
#print "Updated: " + updated
 
#print d1n + "-" + d1 + ":" + d1h + "/" + d1l
#print d2n + "-" + d2 + ":" + d2h + "/" + d2l
#print d3n + "-" + d3 + ":" + d3h + "/" + d3l
#print d4n + "-" + d4 + ":" + d4h + "/" + d4l
#print d5n + "-" + d5 + ":" + d5h + "/" + d5l
 
#temperature, current, wind, dewpoint, pressure, humidity, visibility
#	updated, dXn, dX, dXh, dXl (X=1-5, h=high, l=low, n=name)
 
subject += updated
msg = "Now:"+current+"-"+temperature
msg += "\r\nHum:"+humidity
msg += "\r\n"+d1n + "-" + d1 + ":" + d1h + "/" + d1l
msg += "\r\n"+d2n + "-" + d2 + ":" + d2h + "/" + d2l
msg += "\r\nUpdated:"+updated
 
 
 
 
import smtplib
HOST = "smtp.gmail.com"
PORT = 587
 
try:
	server = smtplib.SMTP(HOST,PORT)
	#server.set_debuglevel(1)    # you don't need this
	server.ehlo()
	server.starttls()
	server.ehlo()
	server.login(ACCOUNT, PASSWORD)
	headers = "From: %s\r\nTo: %s\r\nSubject: %s\r\n\r\n" % (ACCOUNT, to_addrs, subject)
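	# sendmail needs a list of addresses; a single string is treated as one recipient
	server.sendmail(ACCOUNT, [a.strip() for a in to_addrs.split(",")], headers + msg)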
	server.quit()
except:
	time.sleep(1)
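Almost all of the scraping above is the same find-and-slice step repeated over and over. Here is a sketch of a small helper that does that step once, in Python 3. The name extract_between is my own, and the HTML snippet is made up for the example; it is not the real Wunderground page.

```python
def extract_between(page, start, end, pos=0):
    """Return (text, next_pos) for the first start...end pair at or after pos."""
    i = page.find(start, pos)
    if i == -1:
        return None, pos  # start marker missing
    i += len(start)
    j = page.find(end, i)
    if j == -1:
        return None, pos  # end marker missing
    return page[i:j], j + len(end)


# Made-up snippet, only for illustration.
html = '<div id="main"><span>72</span><h4>Clear</h4></div>'
temperature, pos = extract_between(html, "<span>", "</span>")
current, pos = extract_between(html, "<h4>", "</h4>", pos)
print(temperature, current)
# 72 Clear
```

One design difference from the script: str.find returns -1 when a marker is missing, and slicing with that value quietly gives the wrong text. The helper returns None instead, so a layout change on the page is easier to notice.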

Python – Cell Phone Number Pad Input

Here is the first version of a little Python program I made that translates input from a cellphone text pad or a number pad into text.
UPDATE: VERSION 2
It’s pretty darn simple.

It would go perfectly with this:
http://www.flickr.com/photos/svofski/3383950702/in/pool-make
