First, some definitions
- Host: An existing user.
- Guest: A potential users that has been invited by a host.
- Conversion: In this case, we define conversion as making a payment.
- Cohort: A collection of host that share some attribute. If we are calculating the new user virality coefficient, the shared attribute is that they converted in the same time frame.
- Time limit: To make the data comparable across cohorts, we must look at the same time frame. Here, we use one week. If you are familiar with cohort tables, they show the value across several time periods.
- One week new user virality coefficient = Number of converted guests by an existing user within one week of that user converting themselves
- 1wnuvc=(guests,1w | host)/hosts
What the database is assumed to contain
A typical database structure for an invite program could look something like this:Invite_table
Host_id
Guest_id
Guest_first_payment_date
User table
Id
Sign_up_date
First_payment_date
Step 1
First we define what cohort we want to measure. In this example, we will define a cohort as the week of the first payment.select
date_format(user_table. first_payment_date, '%x-%v')
, count(user_table.first_payment_date)
from user_table
group by 1
By using %x instead of %Y to calculate the year, we make sure to calculate the year correctly even when the year changes.
Step 2
We then calculate the virality coefficient per host within given time frame. Let's say we want to measure the one week virality coefficient. The case statement checks if the difference between the conversion of the guest and the host is less than seven days and only counts those cases.Since the date of the host's first payment is stored in user_table, we need to do a join to be able to calculate the difference.
select
invite_table.host_id
,sum(case when
datediff(invite_table.guest_first_payment_date, user_table.first_payment_date) <=7
then 1
else 0
end) as individual_one_week_vc
from invite_table
join user_table
on user_table.id = invite_table.host_id
group by 1
Step 3
Finally, we combine the two. The virality coefficient for the weekly cohort is calculated as the sum of the individual virality coefficient, divided by the number of hosts in the cohort.
select
date_format(user_table. first_payment_date, '%x-%v')
,count(user_table. first_payment_date)
,sum(conversions.individual_one_week_vc)/count(user_table. first_payment_date) as one_week_virality_vc
from user_table
left join (
select
invite_table.host_id
,sum(case when
datediff(invite_table.guest_first_payment_date, user_table.first_payment_date) <=7
then 1
else 0
end) as individual_one_week_vc
from invite_table
join user_table
on user_table.id = invite_table.host_id
group by 1
group by 1
date_format(user_table. first_payment_date, '%x-%v')
,count(user_table. first_payment_date)
,sum(conversions.individual_one_week_vc)/count(user_table. first_payment_date) as one_week_virality_vc
from user_table
left join (
select
invite_table.host_id
,sum(case when
datediff(invite_table.guest_first_payment_date, user_table.first_payment_date) <=7
then 1
else 0
end) as individual_one_week_vc
from invite_table
join user_table
on user_table.id = invite_table.host_id
group by 1
) conversions
on conversions.host_id=user_table.id
group by 1
No comments:
Post a Comment