Feel free to ask and discuss anything about the challenge!
Hi all, I am a little confused by the problem itself and the datasets.
What is the context-free problem to solve?
What is the context-dependent problem to solve?
I mean, should I try to learn a good policy to recommend articles? Or are clicks at random, so that I only need to learn the right balance between exploration and exploitation?
Thanks,
jamh.
Imagine you have a visit on your website and you can choose to highlight one of your articles. The user is described by 136 features. You are rewarded if the article you choose is clicked.
Newest articles tend to perform better (that’s why “always last” performs better than random).
Some articles tend to perform better simply because they are better (that’s why UCB-like algorithms that make no use of the user description also perform better than random; a minimal sketch of such a scorer follows below).
I guess one could combine the first two points in a clever way to get quite good performance without using the user description.
Some articles perform better on some kinds of users (context dependency). That’s the hardest part.
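For reference, the context-free, UCB-like scoring mentioned above can be as simple as the following sketch. It is purely illustrative: the class, method and field names are invented, and this is not code from the challenge starter kit.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative context-free UCB1-style scorer over article ids
// (it ignores the 136 user features entirely).
public class Ucb1ArticleScorer {

    private static class Stats { long displays = 0; long clicks = 0; }

    private final Map<String, Stats> stats = new HashMap<String, Stats>();
    private long totalDisplays = 0;

    // Pick the candidate article id with the highest UCB1 index.
    public String choose(List<String> candidateIds) {
        String best = null;
        double bestIndex = Double.NEGATIVE_INFINITY;
        for (String id : candidateIds) {
            Stats s = stats.get(id);
            if (s == null) { stats.put(id, s = new Stats()); }
            double index = (s.displays == 0)
                    ? Double.POSITIVE_INFINITY // try every article at least once
                    : (double) s.clicks / s.displays
                        + Math.sqrt(2.0 * Math.log(totalDisplays) / s.displays);
            if (index > bestIndex) { bestIndex = index; best = id; }
        }
        return best;
    }

    // Feed back the 0/1 click observed for a matched record.
    public void update(String id, boolean clicked) {
        Stats s = stats.get(id);
        if (s == null) { stats.put(id, s = new Stats()); }
        s.displays++;
        totalDisplays++;
        if (clicked) s.clicks++;
    }
}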
Hi Jeremie, thanks for your answer; good hints.
But I have more questions.
As I understand it, I am not rewarded for selecting an article that the user has clicked, but rather one that is in the dataset. In that sense, what am I learning?
And then, only if I select the same article as in the data (who selected this article?) can I learn whether the user clicked on that article or not.
What is the connection between the selection of the articles at presentation time and the user click?
Thanks a lot,
jamh
The complete answer is in these papers: http://www.research.rutgers.edu/~lihong/pub/Li11Unbiased.pdf
http://hunch.net/~jl/projects/interactive/scavenging/scavenging.pdf
The short one is: do not care about the dataset, consider the problem:
- you see a (described) user and you have a list of possible articles to show,
- you choose to display one article from the list,
- you receive your reward (0 or 1).
Of course you want to maximize your probability of a click (which is the score).
Sometimes you do not receive the reward, but in that case just consider that the user never came to your website.
Ok let me try:
* I should always try to select the right article, because if just by chance it is the chosen one (with 1/30 uniform probability?), then the probability of being rewarded is maximized?
However, I think this only explains the beginning. Is that right?
Thanks,
jamh
I’m not sure I understand the part about probability maximization, but I could agree.
If your policy is to always choose the last element of the list, this basically means you are always choosing the most recent article.
If by chance it was also chosen by the sample collection strategy (which is uniformly random), then you will score the corresponding 0 or 1. If not, this round of evaluation is discarded.
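For concreteness, the “always last” baseline is essentially a one-liner. Here is a minimal sketch reusing the Yahoo types and method names from the Java starter kit (the class name is invented, and the sketch leaves out the ContextualBanditPolicy interface declaration to stay self-contained):

import java.util.List;

import exploChallenge.logs.yahoo.YahooArticle;
import exploChallenge.logs.yahoo.YahooVisitor;

// Sketch of the "always last" baseline: always display the last (most recent)
// article of the candidate list, and ignore all feedback.
public class AlwaysLastPolicy {

    public YahooArticle getActionToPerform(YahooVisitor visitor,
            List<YahooArticle> possibleArticles) {
        return possibleArticles.get(possibleArticles.size() - 1);
    }

    public void updatePolicy(YahooVisitor visitor, YahooArticle article, Boolean reward) {
        // This baseline does not learn anything.
    }
}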
Hi,
I’ve just updated the post concerning the evaluation process. All the details I could think of are there. Hope it helps
Thanks a lot!
How can I differentiate individual users? If, for instance, I don’t want to recommend an article that the user has already clicked, can I assume that the 136-bit context description is unique per user? Or, if I choose an article again for the same user, will the evaluation count it as a click because at some point in the history this same user did click on the article?
We do not have this information… It may be one of the 136 boolean features, but we do not know which one (and it would be interesting to discover it automatically).
In any case, it won’t be counted as a click just because the selected article was clicked before in the history.
thanks.
I think participants should read this paper :
http://dl.acm.org/citation.cfm?doid=1772690.1772758
OK, as I see it now, there are two points of view:
1) I can forget about clicks altogether! I just need to learn the pseudorandom distribution of the selected articles, and that will give me optimal returns.
2) I can forget about the selection altogether! I just need to learn the right articles (the ones the users clicked on), so that when, by random chance (1/30), my article matches the dataset, I maximize the probability of getting a reward of 1 and therefore maximize the return.
What do you think?
1) No. Even if you guess the pseudorandom sequence (good luck) and always choose the matching article, your reward will be the average click value (i.e. 0.0366, so a score of 366).
2) Yes and no. If all users have the same behavior (i.e. the context/description of users is not correlated with click probability), then it’s true. There is strong evidence that this assumption is false (read http://dl.acm.org/citation.cfm?doid=1772690.1772758 ). Selecting the right article is the object of this challenge.
As we don’t know anything about the articles other than their IDs, the user context seems useless to me. So I didn’t use the context information in my approach; actually, my approach works well.
Dear Jeremie, sorry for my emails. I think I am getting closer to understanding the point, but please be patient with me.
Reading the evaluation text of the challenge, it says that:
evaluation = cr / hr,
where:
cr is the number of times you got a click (the click count),
hr is the number of times you chose the same article as in the data (the hit count).
So, let’s consider the four possibilities (2×2 combinations of events):
event A: you chose the same article as in the data (match = 1, no match = 0)
event B: given a match, you got a click (1) or not (0)
A=0, B=*: nothing happens to your return, since the evaluation does not change
A=1, B=0: a penalty, since hr increases but cr remains constant
A=1, B=1: a reward? cr increases by 1, but so does hr, so?
So, in principle you should try to avoid the A=1, B=0 combination, that is, avoid making bad recommendations.
However, is this not perhaps a system biased in favor of avoiding bad recommendations rather than making good ones?
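To make the scoring concrete, the replay evaluation described in the linked papers boils down to something like the sketch below. The LogRecord and Policy types and all field names are invented for illustration; this is not the actual evaluator code.

import java.util.List;

import exploChallenge.logs.yahoo.YahooArticle;
import exploChallenge.logs.yahoo.YahooVisitor;

// Sketch of the replay ("rejection sampling") evaluation discussed above.
public class ReplaySketch {

    // Minimal view of what a policy must provide (mirrors the starter-kit methods).
    public interface Policy {
        YahooArticle getActionToPerform(YahooVisitor visitor, List<YahooArticle> possibleArticles);
        void updatePolicy(YahooVisitor visitor, YahooArticle article, Boolean reward);
    }

    // One logged event: the visitor, the candidate pool, the uniformly random
    // article that was actually displayed, and whether it was clicked.
    public static class LogRecord {
        public YahooVisitor visitor;
        public List<YahooArticle> possibleArticles;
        public YahooArticle displayedArticle;
        public boolean clicked;
    }

    public static double evaluate(Policy policy, Iterable<LogRecord> log) {
        long cr = 0; // clicks obtained on matched records
        long hr = 0; // matched records ("hits")
        for (LogRecord r : log) {
            YahooArticle chosen = policy.getActionToPerform(r.visitor, r.possibleArticles);
            if (chosen.equals(r.displayedArticle)) { // your choice matches the logged one
                hr++;
                if (r.clicked) cr++;
                policy.updatePolicy(r.visitor, chosen, r.clicked); // feedback only on matches
            }
            // non-matching records are simply discarded (event A = 0 above)
        }
        return (double) cr / hr; // evaluation = cr / hr
    }
}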
Right. Avoiding making bad recommendations would be cool. But how to do it?
In the dataset the choice of the article is independent of the user: it is a uniformly random policy.
You can do it by trying to find the best possible association between user and probability of click: that’s fine, that’s our goal.
You can also try to do it by trying not to choose the same article each time you see a “bad” user (well, in fact you would need to be able to recognize a bad user, or to get a first initial click and then always play something other than what is logged). In that case you would need to find the seeds of the random number generators. You don’t know the generation method (probably close to a Mersenne Twister), you don’t know which action was chosen (only whether it matches your choice), and you are not allowed to log information in order to identify it over several evaluations…
Seed identification doesn’t sound reasonable… and this is not the goal. Moreover, the final evaluation will be done with a different dataset, so it is useless to guess the seeds on the first data.
To score the maximum it would be simpler to scan the memory to find the dataset and read the right answers (and that is explicitly forbidden)…
Hi Jeremie, thank you again for your great clarifications!
I think I now understand (I hope so).
Unless I have an oracle that tells me whether an article will match the logging policy, I should select the article with the highest probability of being clicked. So I have to learn such probabilities using the available information, i.e., click feedback and visitor features.
However, I am an Oracle certified professional, so, expect the unexpected! @_@
Great
The unexpected is my grail.
Any guidelines for the acceptable number of simultaneous submissions? So far I have been assuming that 3-5 is OK but that more than 10 is bad.
Unfortunately most of the algorithms I can think of for this problem have at least one free parameter, and (by design) there is no real offline data.
Thanks,
Ed
3-5 seems fair. In fact the “real rule” is that I dedicated 16 cores on the cluster to evaluations, and I want most of the jobs not to stay “pending” for more than 30 minutes. If that happens, I’ll have to limit the number of submissions.
What does it mean when a submission is marked incomplete and there is nothing more in the error/logs file?
I would like to know if the problem was due to
a) timing problem
b) memory usage
c) program ends before input is complete
It can be a): the process is stopped with no error message.
It cannot be b): that would lead to an error.
It can be c): if your program exits, no error message is reported.
Option d) is that you modified the log process and added something at the end of the log file.
I have the following error from time to time.
Exception in thread "main" java.lang.NoClassDefFoundError: myPolicy/MyPolicy
at exploChallenge.MainCluster.main(Unknown Source)
Caused by: java.lang.ClassNotFoundException: myPolicy.MyPolicy
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
… 1 more
I am wondering: is this an issue on the evaluation cluster, or is it my code?
In that case the first thing to check is your Java version: it must be 1.6.
Edit: in that case it was linked to a disk management problem on the cluster. Contact me if this happens again; I’ll do my best to fix it.
Is it possible to add an option to attach a private comment to a submission, so we can tag the submission with some metadata about the algorithm, the parameter values we used, etc.? Just some text that can be associated with a particular submission and is visible only to the contestant who submitted it.
Hmm, I’ll think about it, but with the current version it is not straightforward to implement this feature (though not impossible). Some participants already do that kind of thing with the name of their submission (e.g. submission_algoname_parameters.jar). Of course, if you have 20 parameters, it’s not a very suitable solution.
Edit: when/where do you want to be able to add these comments? At submission time?
Preferably at submission time, yes. Of course, I can note these comments offline in a document and record the performance separately after the evaluation finishes, so it is not a critical requirement.
I made something.
Test it and give me your feedback.
Wow, that was quick! Thanks. I will try it out for my next submissions. My current two submissions are going to breach the time limit, I guess.
Thanks a lot! This feature really helps.
Hi –
Below is how I did this. Since it’s easy to get back the *.jar file from various runs, you can download the jar you want and run a main() method you bury in MyPolicy: java -cp good.jar myPolicy.MyPolicy
package myPolicy;

import java.util.List;

import exploChallenge.logs.yahoo.YahooArticle;
import exploChallenge.logs.yahoo.YahooVisitor;
import exploChallenge.policies.ContextualBanditPolicy;

// Thin wrapper that delegates to whichever policy implementation is being tested.
// Note: the generic type parameters of ContextualBanditPolicy are assumed from
// the method signatures below.
public class MyPolicy implements ContextualBanditPolicy<YahooVisitor, YahooArticle, Boolean> {

    private ContextualBanditPolicy<YahooVisitor, YahooArticle, Boolean> implementation;

    public MyPolicy() {
        // implementation = new SimplePolicy();
        // implementation = new YoungestPolicy();
        // . . .
        implementation = new BestPolicyEver();
    }

    @Override
    public YahooArticle getActionToPerform(YahooVisitor visitor,
            List<YahooArticle> possibleArticles) {
        return implementation.getActionToPerform(visitor, possibleArticles);
    }

    @Override
    public void updatePolicy(YahooVisitor visitor, YahooArticle article, Boolean reward) {
        implementation.updatePolicy(visitor, article, reward);
    }

    // Run offline ("java -cp good.jar myPolicy.MyPolicy") to print which
    // implementation and parameters a given jar was built with.
    public static void main(String[] args) {
        MyPolicy policy = new MyPolicy();
        System.out.println(policy.implementation.toString());
    }
}
Each implementation’s toString() method can be set up to self-document the algorithm, parameters, etc.
Hi, is it just my perception, or is the cluster running a little bit slower now than before?
I guess it is a little bit slower because each process wants to access the data (disk access) and there is quite a large number of processes running at the same time. On the submission webpage I added a cluster load indicator to provide feedback on the load.
The run for victory (on new data after the 1st of June) will be done alone on the cluster if the submitted algorithm has time limit problems.
Clarification: sometimes when the cluster load is orange (9 to 14), your jobs can be sent to the cluster but do not start computation immediately. When the color is red, your process will have to wait. Of course, this extra wait time is not counted in the time limit.
OK! So it is not just my perception; something is going on with the server.
Hi Jeremie,
Another feature request. I don’t know if others are also facing this stupid problem: I seem to notice a bug in my code almost immediately after I submit my solution.
Can we have a way of killing our own currently running tasks somehow? I feel guilty about submitting again almost immediately with a small change, while the submission with the bug is still running, using up valuable server resources and blocking others as well.
-exploreit
Well… I was waiting for such a request (in last year’s challenge we were always facing that kind of problem). This is not straightforward to implement (because of the separation between the web servers and the cluster).
I’ll think about doing something, but as it is really a lot of mess, you can also ask me to kill a job (providing the full reference of the job as stated in the submission mail, xx%yyyyy-zzzz.{zip,jar}). If I get fed up with manual killing, I’ll write an automatic solution.
If your bug leads to a Java exception (which is not caught), the execution will stop and your algorithm won’t waste any resources. If it doesn’t, you can add a few tests in your code and throw an Error as follows if you notice unexpected behavior.
if (n == 0)
    throw new Error("division by 0!");
double a = t / n;
Hi, Jeremie,
Could you please confirm that feature #1 of the user feature vectors is always 1? I found that not to be the case for the 100 test cases given. Thanks.
Do not try to gather information from this sample data: click or no click has been modified, the choice possibilities have been modified, and some attribute values too. They are only here to show you the shape of the data, nothing more.
Edit: I looked more carefully, and you are right: feature 1 is always active, and on some lines very few other features are activated at the same time. So, without any guarantee, I would say that when a feature is not present it may be because it is a “missing” value.
Thanks, Jeremie. Just to double check, does this mean “Feature #1 is the constant (always 1) feature” in the real dataset, as stated in the raw data description?
I was double-checking while you were writing this.
So yes, feature 1 is always there.
Great, thanks
So, can I, for instance, modify the Python code in order to set feature #1 to 1 automatically?
jamh : ok
I am a bit interested in the meaning of “missing”. Does it mean the feature is absent, or that whether the feature is active or not is unobservable for some reason, or both?
In this dataset “missing” seems to refer to both.
Hi, would it be possible to get, within the information for a submission, some kind of indication of the CPU time or elapsed time used by the submission? Then we could tell how much time the algorithm consumed and how much time remains before the time limit.
It would help a lot.
Thanks,
OK. If you want, you can log it as a new column in the log files (if you have any doubt about how to do it, mail me).
I’d be interested to know how to do this.
In explochallenge/eval/MyEvaluationPolicy.py, change the log method to add a timestamp at the end of the output (separated from the score by a space).
I will not override this modification during evaluation. It should be enough (if needed, I’ll do it myself).
Quick question: do all contest participants have to submit a write-up to the workshop before the May 7 workshop submission deadline? Or is the contest winners’ presentation a separate item on the agenda? Also, how many of the top contesting teams are invited to give a presentation?
No, workshop papers are separate from the challenge.
The possibility to give a presentation and to write a challenge paper will (probably) be offered to the 3 best submissions. It also depends on the originality of the contributions.
About prizes and invitations with paid registration to ICML, I’m waiting for more definitive information about sponsorship and the amounts needed.
Some of my recent submissions have been returning with errors unexpectedly. They are the same as other submissions that returned successfully, just with different parameter values, so I am pretty sure the error is on the server end. Thought I should let you all know.
Best,
Ed
They are errors of this type:
/bin/bash: line 0: cd: 59%KF_2_0001-fnf11j8cjko: No such file or directory
Unfortunately I also submitted a file with an actual syntax error, so don’t get thrown off by that.
Yes, you’re right… I saw them. I’m working on it, but I do not understand what is happening. This is something quite rare which seems to happen with higher probability when there are more than 7-8 jobs on the cluster node. I suspect some WebDAV synchronization issues.
If it helps, the error seems to occur most frequently when two submissions are made within a couple of minutes of each other.
-Ed
Thanks… It confirms my feeling that this is a WebDAV bug (I’m thinking of an I/O lock). I restarted it; let’s see if things improve.
The problem seems to be fixed. Thanks!
-Ed
Hi, what about the order in which the features appear in the dataset? In the log reader the order is lost when the features are converted to a binary vector, but in the dataset the features are not trivially ordered.
No, I do not have any information about this. But you are right, in the dataset it doesn’t seem ordered (and I don’t know whether the order is related to a trust value or to an order of appearance for this user).
Are we allowed to convert the timestamp into real world dates and learn from that?
Yes (and it can be useful).
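For example, assuming the 10-digit visitor timestamp is Unix epoch seconds (which its range suggests), calendar features can be derived like this (an illustrative sketch, not starter-kit code):

import java.util.Calendar;
import java.util.TimeZone;

// Turn a visitor timestamp (assumed to be Unix epoch seconds) into calendar
// features such as day of week and hour of day.
public class TimestampFeatures {
    public static void main(String[] args) {
        long timestamp = 1317513291L; // an example visitor timestamp
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        cal.setTimeInMillis(timestamp * 1000L);
        System.out.println(cal.get(Calendar.DAY_OF_WEEK)); // 1 = Sunday ... 7 = Saturday
        System.out.println(cal.get(Calendar.HOUR_OF_DAY)); // 0-23, in UTC
    }
}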
Thanks.
Some clarification questions, answer what you can:
a) Can you tell us anything about whether the number of times an arm is selectable is approximately uniform? In other words, can you say anything about the possibility that one arm will be among the ~30 selectable arms 3 million times while another is selectable only 300,000 times?
b) The articles have a 6-digit id, e.g. 560620. The visitors have a 10-digit timestamp, e.g. 1317513291. Are the article ids just timestamps with ’1317′ removed? This seems to be suggested in the ‘some remarks’ post. That would mean that, in the test data, the visitor timestamp is always /smaller/ than the article timestamp. Can you tell us if this is true in the actual data?
c) You also say “So between two consecutive users possible choices tends to be same but evolves over time.” I think this means that articles with smaller ids will be available towards the beginning of the evaluation, and articles with larger ids will be available towards the end, but that it /isn’t/ necessarily true that at the very beginning I will have the 30 smallest ids and at the very end the 30 largest. Is this right?
Thanks,
Ed
Don’t get me wrong, but I do not think this information is really important. As one of the organizers, I do not have any confidential information about the data: all the information we have is available on this website; I simply have the whole dataset, so I can compute some statistics on it. Anyway, here are a few hints:
a) No, it’s not. It is more or less Gaussian (strictly speaking, it’s not).
The range is 1630 – 107400 displays, the mean is 42600, and the standard deviation is around 20000.
The distribution is skewed towards 0.
b) This is really not important (see below). Anyway, I checked and this is not true.
c) To me, the ids themselves are really not meaningful; they are just a way to identify entities, track them, and differentiate between them. In my code, I simply follow the flow of data and record new ids as they appear, along with the timestamp at which they appear for the first time. I guess that the ids present at the beginning appeared earlier than the first timestamp in the log; so the thing has to warm up before you observe truly new ids. But we do not have any information about all that.
I have a number of jobs that have finished but are still being reported as “in the cluster”.
OK… It seems that the WebDAV bug is back… I’m fixing it.
Edit: should be done. WebDAV again (lots of simultaneous submissions)…
I’m having trouble exporting the jar file. When I tried using build.xml, I got this error message:
../Documents/ExploChallenge/build.xml:21: restrict doesn't support the nested "name" element.
and when I tried using the export function in the package explorer, I got a bunch of error messages like these:
Could not find source file attribute for: '../Documents/ExploChallenge/bin/exploChallenge/Main.class'
Source name not found in a class file - exported all class files in ExploChallenge/bin/exploChallenge
Resource is out of sync with the file system: '/ExploChallenge/bin/exploChallenge/Main.class'.
Can anyone help me with this?
Thanks!
Check your Java version (1.6), recompile, and try to resubmit: it could have been a local bug.
In case someone runs into the same problem, I found a solution: the Ant built into my Eclipse (version 1.7) is outdated, and all you need to do is install Ant 1.8.
Cheers!
Hi, is there any chance numpy can be upgraded to a more current version? I believe at some point I have had some weird issues where the old version of numpy behaves differently.
Apologies for the inconvenience,
Ed
I’ll try to do it on April 25th (the planned release date of the new LTS), but I cannot guarantee it.
If I do it, I will have to stop evaluations for a few hours.
It is also possible to have a local Python installed, independent of the Python used by the system, and to install whatever packages you like on this local Python without affecting the system Python. However, you have to do it manually. This could be a solution.
By the way, I am very happy to have Python colleagues here!
Likewise – thanks for coding the python evaluator!
-Ed
Hi Jeremie, Is today the python day?
Ubuntu delayed the release of the LTS… so we have to wait too.
Edit: Python 2.7 should be available now; see the Python post.
Thanks Jeremie.
I noticed a marked slowdown with Python 2.7.
That’s strange, because 2.7 is usually faster than 2.6.
At the installation level I do not see any reason for that.
Could it be linked to the use of scipy?
Edit: I ran some quick tests.
- Python 2.7 is slightly faster than 2.6
- numpy 1.7 is significantly slower than 1.3
Hi, maybe some Good Samaritan wants to discuss or help me understand why I should (or should not) pay attention to the “age” of an article. I’ve read in the task description that it is very important, but I can’t see why… maybe I am confused by the selection procedure?
I also don’t understand why.
Forget the selection procedure. Think of it this way:
1. Given a context, each article has a different propensity of being clicked. Your job is to select the article with the highest chance of being clicked.
2. For the same article, the chance that it will be clicked by a context c is higher earlier in its lifetime.
So, now the task is to somehow model and trade off between the “appropriateness” and the “novelty” of an article, given a context.
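One crude way to encode that trade-off (purely illustrative: the decay form, half-life and smoothing constants are arbitrary choices, not anything prescribed by the challenge):

// Illustrative trade-off between "appropriateness" and "novelty": a smoothed
// click-through-rate estimate multiplied by an exponential decay in the
// article's age. All constants here are arbitrary.
public class RecencyScore {

    private static final double HALF_LIFE_SECONDS = 6 * 3600; // assumed 6-hour half-life

    // clicks/displays: feedback gathered for this article so far;
    // ageSeconds: time elapsed since the article was first seen in the stream.
    public static double score(long clicks, long displays, long ageSeconds) {
        double ctr = (clicks + 1.0) / (displays + 2.0); // Laplace-smoothed CTR estimate
        double decay = Math.pow(0.5, ageSeconds / HALF_LIFE_SECONDS); // novelty discount
        return ctr * decay;
    }
}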
But I have another interpretation.
At time t, a visitor A comes and reads a new article B.
At time t+n, a new visitor C comes, and he may treat B as a new article too, because visitor C has never read B before.
So the novelty of an article may not decrease across different visitors within a short time.
Hello,
I am getting this error:
bash: ./go.sh: No such file or directory
I changed the argument to YahooLogLineReader, but it doesn’t help. I am using the Python version.
Please let me know if I am making a mistake somewhere… Thanks!
Yasin
In your case (I had a look at your zip files) this is because you are not using the build_submission.sh script.
Your zip file extracts everything into a “submission” directory, whereas the ./go.sh file must be extracted in the current directory. More precisely, the command “unzip yourfile.zip -d mydirectory” must extract everything into “mydirectory”, with the go.sh file located at mydirectory/go.sh.
Nice name.
I think that, according to the current plan, the best model of each team gets evaluated in the second pass of the competition.
I would like to ask: can we pick the algorithm ourselves? My concern is that the best one on the leaderboard might not be the best algorithm, because of overfitting.
Thanks.
No problem, just give me the full name (or the file) of the submission you want after the end of the first part.
Hi, I have a question about the end of the challenge. In the information you say:
“In phase 1, winners will be known at the beginning of June, these winners are strongly encouraged to present their work at the workshop.”
So “who” will be the winners of phase 1? The first X participants? All submissions above a threshold?
“Phase 2 results will be known only at the workshop, it will be the same procedure of evaluation but with more (and new) data. Participants cannot submit any new algorithm, we will use their best submission of phase 1.”
Will this new data have the same shape, or can, for example, the number of user features vary?
And many thanks for the great challenge, it’s real fun!
About the workshop, we think it’s more about having interesting discussions; to stick to your description, I would say X = 3.
But we are open-minded, and if somebody has something new and fun to present then it’s OK.
About the second phase, it will be the same kind of data, but maybe a very few features will be removed (I may have a bigger dataset in the next few days).
If the feature dimension of the phase 2 data is different, then some submissions from phase 1 might not work.
Well… at least mine won’t work. I hard-coded the number of dimensions.
Do not worry (in fact 2 dimensions could be missing), so your hard-coded value should not be a problem.
In any case, for all algorithms that beat “always last” in phase 1, I’ll pay attention to getting them working on the phase 2 data.
I am noticing that random seed variations between runs can cause a swing of up to +/-10 points. Since there are nearly 10 people at the top within a difference of 20, this could be a significant effect. Do you have any suggestions for fixing this issue? Will the best submission of each participant be run a number of times and the best/average score taken?
The final dataset is 4 times bigger, so I expect to have lower variance. Anyway, if the scores appear to be close, I’ll check with some t-tests.
Hi Dr. exploreit.
As I understand it, for the first part of the challenge there is no final round. The winner will be the one with the highest score once the deadline has passed.
For the second round I am not clear yet on how it will work. As I understand it, there will be a final round with more data, as Jeremie explains.
Yes, it is.
The second round is there to avoid any possible overfitting.
Hi Jeremie,
Will we get a chance after June 2 to submit code for the Phase 2 evaluation (taking care of feature removal, tuning constants, etc.)? Or will you just use our best submission file on the phase 2 data? I did not understand what you meant by “I’ll pay attention to get them working on phase 2 data”.
Also, what time exactly will the last submission be accepted tomorrow?
The precise time limit is in Samoa Standard Time (that means that as long as it is June 2nd somewhere on Earth, you can submit).
After the deadline you cannot submit, but you can indicate to me a favorite old submission; this submission will then replace your “best” one. As for “paying attention”, this means not excluding a submission because of a trivial problem with the Phase 2 data.
Thank you for hosting such an enjoyable competition, had a great time!
Yeah, thank you very much for the great competition!!!
Will you later publish the initial dataset, or the dataset from phase 2, so we can continue offline? That would be great.
thanks again
It’ll be released after the workshop through the Yahoo! Webscope program.
It is really great that Yahoo is willing to release these datasets. Lacking real-world data is a big concern for contextual bandit researchers (at least it is for me).