Building an Agent-Based Simulation for Conjoint Analysis

In college I read about the advantages of conjoint analysis over the more intuitive method of using a Likert scale, the familiar rate-this-thing-from-one-to-five scale. As I understood it, people get bored with Likert scales and end up rating everything either at the extremes or at the midpoint, and you can get a better reading by asking people which of two items they prefer. In this post I’d like to share the beginning of a framework for modelling these sorts of situations. Specifically, I’d like to model agents with specific behaviors and see whether those behaviors are apparent through conjoint analysis; that is, I’d like to test conjoint methods under different controlled circumstances.

We will begin, as we always do, by importing stuff.

import random
import numpy as np
import pandas
import scipy.stats

Next, we’ll build an agent. In this example we’re considering an agent’s preference over different magazine titles. Given a list of titles, the agent will randomly pick one or more of those titles. Next, we will randomly assign utilities to those titles. We’ll first pick a subset of the interval [5,25] dollars, representing the agent’s price range for any one title, then we’ll draw utilities from that interval. Thus, some agents will have narrow price ranges, while others will have broad price ranges. Finally, we’ll map titles to utilities, the price beyond which an agent would not pay for the title.

class Agent:
    def __init__( self, titles ):
        # pick some number, n, of titles
        self.n = random.randint(1,len(titles))
        # pick n specific titles
        self.titles = random.sample( titles, self.n )
        # pick a subset of [5,25]
        u = np.sort( scipy.stats.uniform(5,20).rvs(2) )
        # draw uniformly from the subset of [5,25]
        u = scipy.stats.uniform(u[0],u[1]-u[0]).rvs( self.n )
        # build a dictionary of title -> utility
        self.utility = dict( zip( self.titles, u ) )
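
To make the two-stage draw concrete, here is a minimal sketch that uses only the standard library. The helper `draw_utilities` is hypothetical (it is not part of the code above), and `random.uniform` stands in for `scipy.stats.uniform`.

```python
import random

def draw_utilities(n, low=5.0, high=25.0, rng=random):
    """Hypothetical helper: draw n utilities from a random sub-interval of [low, high]."""
    # stage 1: two endpoints in [low, high], sorted to form the agent's price range
    a, b = sorted(rng.uniform(low, high) for _ in range(2))
    # stage 2: n utilities drawn uniformly from that price range
    return (a, b), [rng.uniform(a, b) for _ in range(n)]

random.seed(0)
price_range, utilities = draw_utilities(3)
print(price_range, utilities)
```

Every utility lands inside the agent’s own price range, so an agent whose two endpoints happen to fall close together will value all of its titles about equally.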

Next, we’ll subclass Agent into Hunter and Bro. Both Hunter and Bro have tendencies to value Field and Stream and Men’s Health, but Hunters usually favor Field and Stream more, and Bros typically favor Men’s Health more.

class Hunter( Agent ):
    '''
    Values both Men's Health and Field and Stream,
    but more likely to favor Field and Stream
    '''
    def __init__( self, titles ):
        Agent.__init__( self, titles )
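        # behavior rule: boost a title the agent already picked,
        # otherwise add it with a fresh utility drawn from [5,25]
        if "Field and Stream" in self.titles:
            self.utility["Field and Stream"] *= 1.25
        else:
            u = scipy.stats.uniform(5,20).rvs()
            self.utility["Field and Stream"] = u
            # register the title so that later look-ups in self.titles find it
            self.titles.append( "Field and Stream" )
        if "Men's Health" in self.titles:
            self.utility["Men's Health"] *= 1.1
        else:
            u = scipy.stats.uniform(5,20).rvs()
            self.utility["Men's Health"] = u
            self.titles.append( "Men's Health" )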

class Bro( Agent ):
    '''
    Values both Men's Health and Field and Stream,
    but more likely to favor Men's Health
    '''
    def __init__( self, titles ):
        Agent.__init__( self, titles )
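        # behavior rule: boost a title the agent already picked,
        # otherwise add it with a fresh utility drawn from [5,25]
        if "Men's Health" in self.titles:
            self.utility["Men's Health"] *= 1.25
        else:
            u = scipy.stats.uniform(5,20).rvs()
            self.utility["Men's Health"] = u
            # register the title so that later look-ups in self.titles find it
            self.titles.append( "Men's Health" )
        if "Field and Stream" in self.titles:
            self.utility["Field and Stream"] *= 1.1
        else:
            u = scipy.stats.uniform(5,20).rvs()
            self.utility["Field and Stream"] = u
            self.titles.append( "Field and Stream" )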
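
How strong is the tilt that the 1.25 and 1.1 multipliers produce? The sketch below is a rough, standard-library-only estimate for one case: an agent that had already picked both titles, so that both utilities come from the same price range. The function `share_favoring_boosted` is a hypothetical name, and `random.uniform` again stands in for `scipy.stats.uniform`.

```python
import random

def share_favoring_boosted(trials=10000, big=1.25, small=1.1, seed=0):
    """Estimate how often the title with the larger multiplier ends up preferred."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # the agent's price range, a random sub-interval of [5, 25]
        a, b = sorted(rng.uniform(5, 25) for _ in range(2))
        # two utilities from the same range, then the two multipliers
        favored = big * rng.uniform(a, b)
        other = small * rng.uniform(a, b)
        wins += favored > other
    return wins / trials

print(share_favoring_boosted())
```

The estimate comes out above one half but below one, which is the point of the design: a Hunter usually, but not always, prefers Field and Stream, and likewise for a Bro and Men’s Health.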

Next we’ll define the titles, create some generic Agents, Hunters, and Bros, and mix everyone up in a population.

titles = [ "Men's Health", "Field and Stream", "Women's Digest", "Cosmopolitan", "The New Yorker" ]      
pop = [ Agent( titles ) for i in range(100) ]
hunters = [ Hunter( titles ) for i in range(100) ]
bros = [ Bro( titles ) for i in range(100) ]
pop = pop + hunters + bros
random.shuffle( pop )

Next we’ll write a function to assess the ground truth, or gold standard, of the population statistics. It returns a pandas DataFrame with one row per agent and one column per title, which we can query later.

def gold_standard( population, titles ):
    N, p = len(population), len(titles)
    # one row per agent, one column per title; 0 means "no interest"
    z = np.zeros(( N, p ))
    for i in range( N ):
        for j in range( p ):
            if titles[j] in population[i].titles:
                z[i,j] = population[i].utility[ titles[j] ]
    z = pandas.DataFrame( z, columns=titles )
    return z 
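
One thing a ground truth like this is good for is computing the true share of the population that prefers one title to another, to compare against a survey estimate. The sketch below assumes a simplified representation (a plain list of title-to-utility dicts rather than the DataFrame), and `true_preference_share` is a hypothetical helper.

```python
def true_preference_share(utilities, t0, t1):
    """Fraction of agents whose utility for t0 strictly exceeds that for t1.

    `utilities` is a list of dicts mapping title -> utility; a missing
    title counts as utility 0, matching the zeros in the gold-standard table.
    """
    wins = sum(1 for u in utilities if u.get(t0, 0.0) > u.get(t1, 0.0))
    return wins / len(utilities)

# three made-up agents, for illustration only
toy = [
    {"Men's Health": 12.0, "Field and Stream": 9.0},
    {"Field and Stream": 15.0},
    {"Cosmopolitan": 20.0},
]
print(true_preference_share(toy, "Men's Health", "Field and Stream"))   # 1/3
print(true_preference_share(toy, "Field and Stream", "Men's Health"))   # 1/3
```

The two shares need not sum to one: the third toy agent cares about neither title and counts toward neither side.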

Next we’ll define a questionnaire that we pose to the members of our population, asking them whether they prefer one title or another. Internally, this queries the utilities we assigned to different titles in the Agent class.

def questionnaire( a, t0, t1 ):
    # both titles are of interest: compare utilities
    if( t0 in a.titles )and( t1 in a.titles ):
        t0_value = a.utility[ t0 ]
        t1_value = a.utility[ t1 ]
        if t0_value > t1_value:
            return t0
        elif t0_value < t1_value:
            return t1
        else:
            return "Ambivalent"
    # only one title is of interest: it wins by default
    elif( t0 in a.titles ):
        return t0
    elif( t1 in a.titles ):
        return t1
    else:
        return "No Interest"

Next we’ll define a “study” that surveys a random sample of the population with the questionnaire and reports each answer’s share of the sample as a fraction between 0 and 1.

def study( population, size, t0, t1 ):
    # a size below 1 is read as a fraction of the population
    if size < 1:
        size = int( round( len( population ) * size ) )
    sample = random.sample( population, size )
    # a list, not a lazy map object, so that it can be counted
    ans = [ questionnaire( a, t0, t1 ) for a in sample ]
    # answers nobody gave are absent from the result
    return { answer:ans.count(answer)/float(size) for answer in set( ans ) }
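
Because a study only sees a sample, its fractions carry sampling error. As a rough guide, and assuming simple random sampling without replacement, the textbook standard error of a sample proportion can be sketched as follows. The helper `standard_error` is hypothetical; the sample size of 50 and population of 300 are the values used in this post.

```python
import math

def standard_error(p, n, population_size=None):
    """Standard error of a sample proportion p estimated from n respondents.

    If population_size is given, apply the finite population correction
    for sampling without replacement.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

print(round(standard_error(0.5, 50), 3))        # 0.071
print(round(standard_error(0.5, 50, 300), 3))   # 0.065
```

So with 50 respondents, a reported share near one half could plausibly be off by several percentage points in either direction.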

Next, we can repeatedly apply studies…

N = len( titles )
m = np.zeros(( N, N, 4 ))
for k in range( 4 ):
    for i in range( N ):
        for j in range( i+1, N ):
            outcome = study( pop, 50, titles[i], titles[j] )
            # a title nobody chose is missing from outcome, so default to 0
            m[i,j,k] = outcome.get( titles[i], 0.0 )
            m[j,i,k] = outcome.get( titles[j], 0.0 )
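
Repeating a study matters because each run draws a different sample. The sketch below shows the effect in isolation, using a synthetic population in which a known share prefers some title. The function name and the 0.4 share are made up for illustration, while the population of 300, the sample of 50, and the four repetitions mirror the loop above.

```python
import random
import statistics

def repeated_estimates(true_share=0.4, pop_size=300, sample_size=50, repeats=4, seed=0):
    """Survey the same synthetic population several times; return each estimate."""
    rng = random.Random(seed)
    k = round(true_share * pop_size)
    # 1 = prefers the title, 0 = does not; the true share is k / pop_size
    population = [1] * k + [0] * (pop_size - k)
    return [sum(rng.sample(population, sample_size)) / sample_size
            for _ in range(repeats)]

estimates = repeated_estimates()
print(estimates)
print(statistics.mean(estimates), statistics.stdev(estimates))
```

The individual estimates scatter around the true share, and averaging over repetitions pulls the result closer to it.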

And then plot some somewhat unhelpful results.

import matplotlib.pyplot as plt

# one panel per repetition of the study
fig, axs = plt.subplots(2,2)
axs[0,0].matshow( m[:,:,0], vmin=0, vmax=1 )
axs[0,1].matshow( m[:,:,1], vmin=0, vmax=1 )
axs[1,0].matshow( m[:,:,2], vmin=0, vmax=1 )
axs[1,1].matshow( m[:,:,3], vmin=0, vmax=1 )
plt.show()
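
One way to get something more readable than four heat maps might be to average the shares over the repetitions (for example with `m.mean(axis=2)`) and then rank the titles by their mean share across all opponents. The ranking step is sketched below with plain nested lists; `rank_titles` and the toy numbers are hypothetical.

```python
def rank_titles(titles, shares):
    """Rank titles by their mean win share across all opponents.

    shares[i][j] is the fraction of respondents who chose titles[i]
    over titles[j]; the diagonal is ignored.
    """
    n = len(titles)
    scores = {}
    for i in range(n):
        row = [shares[i][j] for j in range(n) if j != i]
        scores[titles[i]] = sum(row) / len(row)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# made-up shares for three titles, for illustration only
toy_titles = ["A", "B", "C"]
toy_shares = [
    [0.0, 0.6, 0.7],
    [0.3, 0.0, 0.5],
    [0.2, 0.4, 0.0],
]
for title, score in rank_titles(toy_titles, toy_shares):
    print(title, round(score, 2))
```

This is only a crude summary, since it ignores how many respondents had no interest in either title of a pair.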

In the future I’ll do some more work regarding conjoint analysis, but I feel like this is a decent start on building a simulation for testing.