An important scheduling problem is dealt in this project. The unrelated parallel machine scheduling with the objective to minimize the makespan is addressed. This problem has a set of non-preemptive jobs and a set of machines with the following characteristics: i) Each job is allocated to only one machine; ii) There is a processing time to complete each job in a machine; iii) There is a setup time that depends on the sequence of the allocation of the jobs in the machines. The objectives involve allocating all jobs in the machines, minimizing the makespan (or the maximum completion time of the scheduling).

In a recent paper submitted to Computers & Operations Research journal, the authors of this project propose a new meta-heuristic for the unrelated parallel machine scheduling problem. This method, named Q-LNS, combines Adaptive Large Neighborhood Search (Ropke and Pisinger, 2006) with Q-learning (Watkins, 1989; Watkins and Dayan, 1992) reinforcement learning. The details of the Q-LNS algorithm and its source code are available in the MethodsX journal. The set of instances used is available in Scheduling Research (2005). The source code and complete results of the Q-LNS algorithm are also available here.

### References:

Ropke, S., Pisinger, D., 2006. An adaptive large neighborhood search heuristic for the pickup and delivery problem with time windows. Transportation science 40, 455–472.

Scheduling Research, 2005. Scheduling research virtual center. A web site that includes benchmark problem data sets and solutions for scheduling problems. Available in http://www.schedulingresearch.com. Accessed on July 1, 2013.

Watkins, C. J. C. H., 1989. Learning from delayed rewards. Phd thesis, University of Cambridge, England.

Watkins, C. J. C. H., Dayan, P., May 1992. Q-learning. Machine Learning 8 (3), 279–292.