Saturday, April 21, 2012

Re: Question related to previous lecture.



---------- Forwarded message ----------
From: William Cushing <william.cushing@gmail.com>
Date: Fri, Apr 20, 2012 at 11:25 PM
Subject: Re: Question related to previous lecture.
To: Subbarao Kambhampati <rao@asu.edu>
Cc: Juan Guzman <jcguzma1@asu.edu>


Well I suppose Rao CC'd me since I know a bit about games.  So here's info...but perhaps more than you wanted...

Opponent modeling is (very, very, ...) hard; an opponent can figure out that you've figured them out (or just get worried about the possibility) and suddenly change their style of play (for example, play randomly for a spell).

Last time I checked the case of Poker, which is perhaps as much as 5 years ago (or more!), the state-of-the-art had shifted back to not using opponent modeling, partially because of this sort of issue,
and partially because the state-of-the-art in finding the optimal line of play had advanced.
Specifically the advance that I'm aware of is to use state abstraction in order to produce a smaller, simpler game.
One that can be brute-force solved for the optimal strategy.
In that instance, it beat the then reigning approach based on opponent modeling.

One thing to keep in mind about Poker is that `bluffing' is a part of optimal play---in the Nash sense---and is something that two infinitely computationally powerful and rational agents would do to one another.
In *real-life* poker, the people who make the money are the people who ``play the odds''. 
It is only in popular culture that we imagine Poker to be a game largely about opponent modeling. 
In truth it is a fantastic exercise in relatively complicated statistical theory---and self-control.
The practical variant is of course investing in the stock market.

The most interesting work in opponent modeling that I'm aware of is for Rock Paper Scissors.
Here the Nash strategy is easy to compute, so, in a tournament setting, if you want to win the tournament, you *have* to be better at opponent modeling than everyone else is.
That is, the Nash player(s) for RPS will place in the dead-middle of a tournament.
Those who attempt to win will make profit off of each other; half will place above average, and half below.
The winner of the year I looked at in detail is called Iocaine (a reference to Princess Bride, and descriptive of its algorithm).
You may have seen the relevant Star Trek episode concerning Data playing against a grand master of a made-up game (Stratagus?).
(The lesson is that the only way to lose is to try to win.)

--

Final word on opponent modeling: No one really wants to play in such lopsided settings.
Either the stronger player is teaching, in which case the point is far from trying to punish the student as much as possible for making mistakes,
or a handicap is imposed---which is still a form of teaching, in that the stronger player isn't going to be upset by losing (but they also aren't going to make any hasty or kind moves).
Kasparov against a not-chess-genius 10-year-old at a two-rook and queen handicap, for example, could be excused for taking 10 minutes to think.

Some like to look at a handicap as a challenge in opponent modeling; how can you provoke the weaker player into mistakes, working towards getting rid of their material advantage?
But the better way is just to play normally, and let the mistakes happen naturally, if they do.
If they don't, then play another game, and reduce the handicap :).

Long story short, even when you far outclass your opponent in domain knowledge or raw computational ability, in almost all situations (but not tournament-RPS) it still makes sense to pretend that you don't.

-Will

P.S. Oh, and regarding Deep Blue, basically IBM hired every grandmaster they could get their hands on, especially those with the most experience playing Kasparov, and they collaborated on extensive study of all of his games ever recorded (something such masters of course had already spent years of their life doing), largely to produce what is known as an ``opening book'' geared especially towards playing Kasparov, that is, one containing the greatest pertinent details on every game he has ever played (and has since been analyzed to death).  I think little about the hardware or algorithms was very specifically targeted at Kasparov; after all, how can you?  He wouldn't be best if he had blind spots that anyone else can identify.  Recently much that was mysterious about Deep Blue is no longer `classified', but still much about the whole affair is hush-hush.  Whether or not that opening book helped at all is debatable, obviously the games were not repeats of past games.  From what I understand, the modern assessment is that Kasparov was defeated by his own humanity; the final games he lost in frustration, that is, it seems he should have been able to force a win if not for the unusual stress he had no experience in dealing with.  (The type of stress is hilarious: he couldn't stare-down Deep Blue; his inability to intimidate the opponent befuddled him, or so the story goes.  The poor technician tasked with implementing the computer's moves on the physical board describes his fear and trepidation upon approaching Kasparov quite colorfully.)

Modern chess-playing software is largely held to have surpassed Deep Blue, without resorting to special hardware, nor with any concerns about possibly being opponent-specific.
I don't know what the current reigning champ is, but for a spell I know Fritz was used as a study tool by most grandmasters (read: "Is stronger than.").



On Fri, Apr 20, 2012 at 5:51 PM, Subbarao Kambhampati <rao@asu.edu> wrote:
Opponent modeling is done a lot of time (and for games like Poker, it is pretty much de rigueur. )

It may be less critical in chess where, at least at the grand master level, "optimal opponent" is a reasonable assumption.

I don't off hand know whether Deep Blue did opponent modeling to any extent--the classic reference on 
Deep Blue is this paper:


Rao


On Fri, Apr 20, 2012 at 5:02 PM, Juan Guzman <jcguzma1@asu.edu> wrote:

I had this thought during class, but I wasn't sure if it was necessarily relevant to the lecture.
Do agents like deep blue always assume an optimal opponent such as Kasparov? It seems to me that one would be able to deduce the expertise level of an opponent using the existing information of the game tree and analyzing the opponent's moves. Like the example you gave in class, If Kasparov would take 10 minutes to make a move against a 5 year old, we would consider it silly; If deep blue saw that the opponent kept making moves that were highly in its favor (mistakes) could we use that information to make the agent execute moves more suited to the situation? Rather than assuming that it's playing against a grand master and using min max, we can calculate the possibility of the opponent making an arbitrary (non optimal) move and make "bolder" moves as a result? Or does deep blue already make optimal moves regardless of the skill level of opponents?

- Juan
Sent from my mobile device




No comments:

Post a Comment