<?xml version="1.0" encoding="UTF-8"?>
<mods xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.loc.gov/mods/v3" version="3.1" xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-1.xsd">
  <titleInfo>
    <title>Robot Motion Planning Using Deep Reinforcement Learning</title>
  </titleInfo>
  <name type="personal">
    <namePart>Khan, Zahra</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">creator</roleTerm>
    </role>
  </name>
  <name type="personal">
    <namePart>Supervisor : Dr. Fahad Iqbal</namePart>
  </name>
  <typeOfResource>text</typeOfResource>
  <originInfo>
    <issuance>monographic</issuance>
  </originInfo>
  <physicalDescription>
    <extent>62 p. ; 30 cm (soft copy)</extent>
  </physicalDescription>
  <note type="statement of responsibility">Zahra Khan</note>
  <abstract>Effective motion planning is essential for autonomous robots operating in challenging
environments. Sampling-based algorithms draw random samples from high-dimensional spaces but
struggle in complex environments because of slow convergence and sampling inefficiency.
Advances in Deep Reinforcement Learning (DRL) overcome these limitations by learning optimal
policies directly from interaction with the environment, reducing reliance on prior
environmental data and converging faster. However, DRL often depends on sparse reward
functions, which lead to suboptimal paths and poor exploration. To move beyond these
shortcomings, we propose a reward function built on information sourced from active SLAM. The
SLAM-weighted reward improves navigation efficiency through richer environment perception and
greater robustness in unexplored areas. It comprises distance-to-target and trajectory-smoothness
terms that encourage shorter paths, reduced oscillation, and more stable robot behavior. We pair
this reward function with the Soft Actor-Critic (SAC) algorithm because of its strong
generalization to new environments. Comparative experiments against Twin Delayed Deep
Deterministic Policy Gradient (TD3) and Deep Deterministic Policy Gradient (DDPG) in cluttered
and sparse environments demonstrated the superior performance of SAC. In sparse settings, SAC
achieved a 100% success rate, performing 9.4% better than TD3 and 14.1% better than DDPG, while
using 35% fewer steps than TD3 and 63% fewer than DDPG. In cluttered settings, SAC achieved an
87.5% success rate, performing 40.6% better than TD3 and 71.9% better than DDPG. These results
attest to the effectiveness of the SAC algorithm for autonomous robot navigation.</abstract>
  <subject>
    <topic>MS Robotics and Intelligent Machine Engineering</topic>
  </subject>
  <classification authority="ddc">629.8</classification>
  <identifier type="uri">http://10.250.8.41:8080/xmlui/handle/123456789/55263</identifier>
  <location>
    <url>http://10.250.8.41:8080/xmlui/handle/123456789/55263</url>
  </location>
</mods>
