Tool Name: MDP Puzzle Generator

Tool Description:

The title of this tool is slightly misleading. The application allows users to create textual input files for a much broader spectrum of problems, not just for standard MDPs. The parser this program uses processes information for any standard POSG-related problem, including single-agent/decentralized cooperative/non-cooperative MDPs and POMDPs.

This program allows users to create an MDP Puzzle input file. Such input files contain the standard elements of an MDP problem, including transitional, observational, and reward probabilities, as well as other problem specifications such as the number of agents, whether the problem yields cost or reward, and a list of all possible states, actions, and observations that each agent can visit, take, and make, respectively. The data from these input files can then be parsed and entered into MDP-solving data structures, which are mostly matrices representing probability tables. A number of algorithms have been implemented to solve the newly created MDP problems by accessing these data structures.

Tool Usage:

To use this tool, the following source files must be compiled into the same folder or directory:

MdpPuzzleGenerator.java
PosgInputFileParser.java
SpringUtilities.java

An additional file, PosgParserTest.java, can also be compiled to verify an input file's correctness. After successful compilation, running the java command on the MdpPuzzleGenerator class starts the program.

The doc folder contains javadoc documentation of the program's API as well as a detailed text file, INPUT FILE SPECIFICATIONS.txt, describing valid (and invalid) user inputs.

PLEASE CONSULT THE 'INPUT FILE SPECIFICATIONS.txt' FILE WHEN USING THIS PROGRAM TO AVOID INPUT ERRORS.

Successful file generation depends on the completion of 3 input screens.
The first asks the user to enter data describing the nature of the MDP-related problem. Namely:

Title: Here, the user should specify the name of the file that will contain the entire MDP-related problem specification. PLEASE NOTE! THE FILE WILL BE CREATED IN THE DIRECTORY FROM WHICH THE Java Virtual Machine WAS CALLED.
Discount: Here, the user should specify the discount value that will be applied when solving the problem. This value should be in DOUBLE format.
Values: Here, the user should specify whether this problem deals with cost or reward.
States: Here, the user should specify the number of unique puzzle states. Instead of giving the total number of states, the user can enumerate the states by name. The state names should be separated by a single whitespace character (commas are not a valid delimiter).
Agents: Here, the user should specify the number of agents the problem will deal with. Specifying 1 agent signifies that the problem is single-agent, while specifying 2 or more implies that the problem is decentralized.
Start: Here, the user should specify the probability that the problem starts in each of the states enumerated (or accounted for) in the States textfield. The values must be entered in DOUBLE format, the number of values must match the number of states specified earlier, and all values must sum to 1. Again, values should be separated by a single whitespace character; commas are not valid delimiters.
ALL ACTIONS: Here, as in the States textfield, the user should specify the number of unique actions that can be taken by any of the agents. The user can specify an integer count or enumerate the actions, using a single whitespace character as the delimiter. PLEASE NOTE! It may be the case that only some of the agents can take certain actions, but all actions must be accounted for in this field.
ALL OBSERVATIONS: Here, exactly as in the ALL ACTIONS textfield, the user should specify the observations any agent can make.
Specifying 1 observation and later setting the probability of that observation to 1.0 implies that the problem is fully observable.

Once all fields have been entered, the user should click the "Generate Puzzle" button. If all values entered are valid, a new text file with the name specified in the Title field will be generated and a new screen will appear. If any of the values are invalid, the screen will not change. To determine which field has an invalid entry, please consult the INPUT FILE SPECIFICATIONS.txt file. To clear all values and start again, the user may click the "Clear Values" button.

The second screen asks the user to specify which actions each agent can take and which observations each agent can make. The user can specify that a particular agent can take all possible actions or make all possible observations by selecting ALL from the appropriate list, or can select any combination of actions and/or observations by holding down the Ctrl key and clicking on each desired action or observation. When finished, the user should click on the "Register Parameters" button. PLEASE NOTE! Each list is initially set to a null value, represented to the user by "--". The user must select a different value for every list on the screen in order for processing to take place.

The third screen allows the user to specify transitional, observational, and reward probabilities. The user can enter each probability manually or use two keywords, UNIFORM and IDENTITY, to specify uniform values or identity matrices of values (where appropriate). For each probability the user wishes to enter, at least the action must be set to a value other than the original null value "--". The user may then select values for the other fields or leave them at their null values. Depending on how many non-null values the user specifies in the lists, a single value, a line of values, or a matrix of values must then be entered into the corresponding probabilities textbox.
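As a rough illustration of the two keywords, the sketch below shows what UNIFORM and IDENTITY are assumed to expand to for transition and observation tables: a matrix whose rows are uniform distributions, and a square matrix with 1.0 on the diagonal. This reading is an assumption based on the common meaning of the terms; the class and method names are hypothetical and do not come from the tool's source. INPUT FILE SPECIFICATIONS.txt remains the authority on what the parser accepts.

```java
import java.util.Arrays;

/** Illustrative only: not part of MdpPuzzleGenerator or its parser. */
class MatrixKeywordSketch {

    /** Assumed meaning of UNIFORM: every row is the uniform distribution over the columns. */
    static double[][] uniform(int rows, int cols) {
        double[][] m = new double[rows][cols];
        for (double[] row : m) {
            Arrays.fill(row, 1.0 / cols);
        }
        return m;
    }

    /** Assumed meaning of IDENTITY: square matrix, 1.0 on the diagonal, 0.0 elsewhere. */
    static double[][] identity(int n) {
        double[][] m = new double[n][n]; // Java zero-initializes the array
        for (int i = 0; i < n; i++) {
            m[i][i] = 1.0;
        }
        return m;
    }

    /** A row is a probability distribution if no entry is negative and the entries sum to 1. */
    static boolean isDistribution(double[] row) {
        double sum = 0.0;
        for (double p : row) {
            if (p < 0.0) {
                return false;
            }
            sum += p;
        }
        return Math.abs(sum - 1.0) < 1e-9;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.deepToString(uniform(2, 4)));
        System.out.println(Arrays.deepToString(identity(3)));
    }
}
```

The isDistribution check mirrors the rule stated for the Start field (values must sum to 1) and can be applied to any manually entered line of probabilities before it is typed into the tool.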
Once all values have been entered, the user should click the corresponding "Update ..." button. The user can enter as many or as few (including 0) probabilities as he or she wishes.

PLEASE CONSULT THE 'INPUT FILE SPECIFICATIONS.txt' FILE WHEN USING THIS PROGRAM TO AVOID INPUT ERRORS.

Once all values have been entered, the user should click on the "Accept" button. At the end screen, the user can choose to repeat the process and create a new file or to quit the program.

*********************************************************************************************************************
*********************************************************************************************************************
*********************************************************************************************************************

Tool Name: Grid Puzzle Generator

Tool Description:

This program allows users to generate a very specific type of DEC-POMDP problem, namely a partially observable 2-agent grid world puzzle. A solution to such a problem is a policy that leads both agents to the same square, or location, on the grid. An agent's observability is limited to the 8 grid locations immediately surrounding the agent's current location as well as the agent's location itself (a total of 9 squares). Both agents observe the same combination of surrounding squares, for a total of 2^9 possible observations for each agent. Each agent can move only 1 square in any of the four main cardinal directions (up, down, left, and right) or may choose to stay put in its current grid location. The problem also allows for stochastic behavior, in the sense that an agent may not always move in the direction it has chosen. Instead, probabilities may be specified for moving the agent in directions other than the one specified.
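To make the 2^9 figure concrete, the following sketch packs the 3x3 neighbourhood of an agent into a 9-bit integer, one bit per square. It assumes each observed square reports only whether it is vacant or uninhabitable (an obstacle or off the grid). The bit order, class name, and method names are hypothetical choices made for illustration and are not taken from GridPuzzleGenerator.

```java
/** Illustrative only: not the encoding used by GridPuzzleGenerator. */
class ObservationSketch {

    /** True if (row, col) lies outside the grid or holds an obstacle. */
    static boolean uninhabitable(boolean[][] obstacle, int row, int col) {
        return row < 0 || row >= obstacle.length
                || col < 0 || col >= obstacle[0].length
                || obstacle[row][col];
    }

    /**
     * Packs the 3x3 neighbourhood around (row, col) into 9 bits, scanning
     * left to right, top to bottom. Bit k is set when square k is uninhabitable.
     */
    static int encode(boolean[][] obstacle, int row, int col) {
        int bits = 0;
        int k = 0;
        for (int dr = -1; dr <= 1; dr++) {
            for (int dc = -1; dc <= 1; dc++) {
                if (uninhabitable(obstacle, row + dr, col + dc)) {
                    bits |= 1 << k;
                }
                k++;
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        boolean[][] obstacle = new boolean[3][3]; // 3x3 grid, initially empty
        obstacle[1][2] = true;                    // one obstacle to the right of the centre
        System.out.println("upper bound on observations: " + (1 << 9));
        System.out.println("centre: " + Integer.toBinaryString(encode(obstacle, 1, 1)));
        System.out.println("corner: " + Integer.toBinaryString(encode(obstacle, 0, 0)));
    }
}
```

Nine independent vacant/uninhabitable bits give at most 1 << 9 = 512 distinct values per agent, which is where the upper bound in the description comes from; a particular grid layout will typically produce far fewer.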
Input files, similar to those produced by the MDP Puzzle Generator program, are created when the Grid Puzzle Generator program is executed, and the data is also stored in specific data structures (mostly probability matrices). The data file created adheres to the standard MDP input file format and can be parsed and passed to a number of algorithms designed to solve this type of MDP problem.

PLEASE NOTE! This application is very heavily memory-dependent, and its capabilities are therefore very limited with presently available memory resources. It is highly recommended that the program be run on machines with the largest amount of available memory (RAM) and that no other memory-intensive programs be run concurrently. A discussion of memory-dependent program factors is included below in all applicable contexts.

Tool Usage:

To use this tool, the following source files must be compiled into the same folder or directory:

GridPuzzleGenerator.java
GridInputFileParser.java

After successful compilation, running the java command on the GridPuzzleGenerator class starts the program. It is also recommended that the -Xmx option (for example, -Xmx1024m) be used when launching the program to raise the maximum amount of memory allocated to the JVM.

The doc folder contains javadoc documentation of the program's API as well as a detailed text file, INPUT FILE SPECIFICATIONS.txt, describing valid (and invalid) user inputs.

PLEASE CONSULT THE 'INPUT FILE SPECIFICATIONS.txt' FILE WHEN USING THIS PROGRAM TO AVOID INPUT ERRORS.

Successful file generation depends on the completion of 3 input screens.

The first asks the user to describe the nature of the puzzle grid world. Namely:

Title: Here, the user should specify the name of the file that will contain the entire MDP-related problem specification. PLEASE NOTE! The file will be created in the directory from which the JVM was called.
Rows: Here, the user should specify the number of rows (an INTEGER) the grid world will have.
Cols: Here, the user should specify the number of columns (an INTEGER) the grid world will have.
Discount: Here, the user should specify the discount value that will be applied when solving the problem. This value should be in DOUBLE format.
Values: Here, the user should specify whether this problem deals with cost or reward.

Once all fields have been entered, the user should click the "Generate Grid" button. If all values entered are valid, a new text file with the name specified in the Title field will be generated and a new screen will appear. If any of the values are invalid, the screen will not change. To determine which field has an invalid entry, please consult the console from which the application is running. PLEASE NOTE! Due to memory constraints on most personal computers, grids larger than 4x4 are not recommended.

The second screen asks the user to define the grid world in more detail, specifying the starting location of each agent and any obstacles that prevent agents from entering a location on the grid. To specify an agent's starting location, the user should click on the large letter A, drag it over to the correct square on the grid, and release the mouse button. The same should be done for the second agent as well as for any obstacle(s) that exist on the grid.

Once all agents and obstacles have been placed, the user should specify which of the 9 grid squares (the agent's own square and the 8 surrounding it) each agent can observe. To do this, the user should click on each square that should be observed on the small 3x3 grid in the lower right-hand side of the screen. A successfully selected square is shown with a BLUE background. Observed, in this case, means that the agent can look at the corresponding grid world square (as selected on the small grid) and recognize whether the location is vacant or uninhabitable.
An uninhabitable grid location is one that either contains an obstacle or is outside the bounds of the grid world. As the grid is processed, a set of unique observations is determined based on the grid's layout (size and obstacles present). The cardinality of this set may be exponentially larger than the number of observation squares specified (up to 2^x, where x is the number of selected observation squares).

PLEASE NOTE! The number of observation squares selected is also a strong memory factor, as both the observational and reward matrices depend on the number of unique observations that can be made by each agent. If a large number of observation squares is selected, it is recommended that a smaller grid be used. Also, for implementation simplicity, the agent is always assumed to be facing north (the upward direction). As a result, the observations chosen on the observation grid are absolute and bear no relationship to the direction the agent has just come from.

Once both agents have been placed on the puzzle grid (along with any number of obstacles), and all agent observations have been specified on the smaller observation grid, the user should click the "Process Grid" button. If the problem size falls within system memory constraints, the second program screen will close and the third program screen will appear.

The third screen allows the user to specify the deterministic or stochastic nature of the grid world puzzle. In other words, the user should specify the probability that the agent moves in each of the 4 cardinal directions after choosing to move in a particular direction. For a perfectly deterministic problem, the value in the Correct field should be 1.0 and the rest should be 0.0. A popular stochastic example is one in which the agent moves in the specified direction 80% of the time and perpendicular to it 20% of the time (10% in each of the two perpendicular directions). To use this format, the user should enter 0.8 in the Correct textfield and 0.1 in each of the Perp. Left and Perp. Right textfields. The user can also specify the probability that the agent will not move at all when a move in any of the four cardinal directions has been chosen.

PLEASE NOTE! It is assumed that the agent will NEVER MOVE if the chosen action specified that the agent not move from its current location. This avoids situations in which two agents that are adjacent to one another move apart, contrary to their specified actions. Essentially, this feature increases the probability of the problem being solvable.

Once all values have been entered, the user should click on the "Finish and Exit" button.
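The movement model described for the third screen can be sketched as a small function that maps a chosen action to a distribution over actual outcomes. This is only an illustration under stated assumptions: "left" and "right" are taken relative to the chosen direction, the five values are required to sum to 1, and the opposite parameter is a hypothetical name for the probability of moving in the reverse direction, since the label of that field is not given above. None of the names come from GridPuzzleGenerator.

```java
import java.util.EnumMap;
import java.util.Map;

/** Illustrative only: not the model code used by GridPuzzleGenerator. */
class MoveModelSketch {

    /** The four cardinal moves in clockwise order, followed by STAY. */
    enum Move { UP, RIGHT, DOWN, LEFT, STAY }

    /** Rotates a cardinal move clockwise by the given number of quarter turns. */
    static Move rotate(Move m, int quarterTurns) {
        return Move.values()[(m.ordinal() + quarterTurns) % 4];
    }

    /** Returns the probability of each actual outcome given the chosen action. */
    static Map<Move, Double> outcomes(Move chosen, double correct, double perpLeft,
                                      double perpRight, double opposite, double stay) {
        double sum = correct + perpLeft + perpRight + opposite + stay;
        if (Math.abs(sum - 1.0) > 1e-9) {
            throw new IllegalArgumentException("probabilities must sum to 1, got " + sum);
        }
        Map<Move, Double> result = new EnumMap<>(Move.class);
        for (Move m : Move.values()) {
            result.put(m, 0.0);
        }
        if (chosen == Move.STAY) {
            // An agent that chooses not to move never moves.
            result.put(Move.STAY, 1.0);
            return result;
        }
        result.put(chosen, correct);
        result.put(rotate(chosen, 3), perpLeft);  // three clockwise turns = one turn left
        result.put(rotate(chosen, 1), perpRight);
        result.put(rotate(chosen, 2), opposite);
        result.put(Move.STAY, stay);
        return result;
    }

    public static void main(String[] args) {
        // The 80/10/10 example: Correct = 0.8, Perp. Left = 0.1, Perp. Right = 0.1.
        System.out.println(outcomes(Move.UP, 0.8, 0.1, 0.1, 0.0, 0.0));
    }
}
```

With the 0.8/0.1/0.1 settings and UP as the chosen action, the sketch assigns 0.8 to UP and 0.1 each to LEFT and RIGHT; choosing STAY always yields STAY with probability 1.0, matching the note above.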