<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
 <record>
  <leader>04145nam a22004937i 4500</leader>
  <controlfield tag="001">000729036</controlfield>
  <controlfield tag="003">CZ-PrVSE</controlfield>
  <controlfield tag="005">20260120142922.0</controlfield>
  <controlfield tag="007">ta</controlfield>
  <controlfield tag="008">251217t20192019xxua   fr     001 0 eng d</controlfield>
  <datafield tag="STA" ind1=" " ind2=" ">
   <subfield code="a">X-LITERATURA V SYLABECH</subfield>
  </datafield>
  <datafield tag="020" ind1=" " ind2=" ">
   <subfield code="a">978-1-886529-39-7</subfield>
   <subfield code="q">(vázáno)</subfield>
  </datafield>
  <datafield tag="040" ind1=" " ind2=" ">
   <subfield code="a">ABA006</subfield>
   <subfield code="b">cze</subfield>
   <subfield code="c">ABA006</subfield>
   <subfield code="d">ABA006</subfield>
   <subfield code="e">rda</subfield>
  </datafield>
  <datafield tag="072" ind1=" " ind2="7">
   <subfield code="a">519.1/.8</subfield>
   <subfield code="x">Kombinatorika. Teorie grafů. Matematická statistika. Operační výzkum. Matematické modelování</subfield>
   <subfield code="2">Konspekt</subfield>
   <subfield code="9">13</subfield>
  </datafield>
  <datafield tag="072" ind1=" " ind2="9">
   <subfield code="a">519</subfield>
   <subfield code="x">Probabilities and applied mathematics</subfield>
   <subfield code="2">Conspectus</subfield>
   <subfield code="9">13</subfield>
  </datafield>
  <datafield tag="080" ind1=" " ind2=" ">
   <subfield code="a">519.816-022.218</subfield>
   <subfield code="2">MRF_2003</subfield>
  </datafield>
  <datafield tag="080" ind1=" " ind2=" ">
   <subfield code="a">519.85</subfield>
   <subfield code="2">MRF_2003</subfield>
  </datafield>
  <datafield tag="080" ind1=" " ind2=" ">
   <subfield code="a">004.055</subfield>
   <subfield code="2">MRF_2003</subfield>
  </datafield>
  <datafield tag="080" ind1=" " ind2=" ">
   <subfield code="a">004.825</subfield>
   <subfield code="2">MRF_2014</subfield>
  </datafield>
  <datafield tag="080" ind1=" " ind2=" ">
   <subfield code="a">(048.8)</subfield>
   <subfield code="2">MRF_2003</subfield>
  </datafield>
  <datafield tag="099" ind1=" " ind2="9">
   <subfield code="a">519.8BER</subfield>
  </datafield>
  <datafield tag="100" ind1="1" ind2=" ">
   <subfield code="a">Bertsekas, Dimitri P.</subfield>
   <subfield code="7">vut2010439815</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
   <subfield code="a">Reinforcement learning and optimal control /</subfield>
   <subfield code="c">by Dimitri P. Bertsekas</subfield>
  </datafield>
  <datafield tag="264" ind1=" " ind2="1">
   <subfield code="a">Belmont :</subfield>
   <subfield code="b">Athena Scientific,</subfield>
   <subfield code="c">[2019]</subfield>
  </datafield>
  <datafield tag="264" ind1=" " ind2="4">
   <subfield code="c">©2019</subfield>
  </datafield>
  <datafield tag="300" ind1=" " ind2=" ">
   <subfield code="a">xiv, 373 stran :</subfield>
   <subfield code="b">ilustrace</subfield>
  </datafield>
  <datafield tag="336" ind1=" " ind2=" ">
   <subfield code="a">text</subfield>
   <subfield code="b">txt</subfield>
   <subfield code="2">rdacontent</subfield>
  </datafield>
  <datafield tag="337" ind1=" " ind2=" ">
   <subfield code="a">bez média</subfield>
   <subfield code="b">n</subfield>
   <subfield code="2">rdamedia</subfield>
  </datafield>
  <datafield tag="338" ind1=" " ind2=" ">
   <subfield code="a">svazek</subfield>
   <subfield code="b">nc</subfield>
   <subfield code="2">rdacarrier</subfield>
  </datafield>
  <datafield tag="504" ind1=" " ind2=" ">
   <subfield code="a">Obsahuje bibliografii a rejstřík</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
   <subfield code="a">In this book we consider large and challenging multistage decision problems, which can be solved in principle by dynamic programming (DP for short), but their exact solution is computationally intractable. We discuss solution methods that rely on approximations to produce suboptimal policies with adequate performance. These methods are collectively referred to as reinforcement learning, and also by alternative names such as approximate dynamic programming, and neuro-dynamic programming. Our subject has benefited greatly from the interplay of ideas from optimal control and from artificial intelligence. One of the aims of the book is to explore the common boundary between these two fields and to form a bridge that is accessible by workers with background in either field. Our primary focus will be on approximation in value space. Here, the control at each state is obtained by optimization of the cost over a limited horizon, plus an approximation of the optimal future cost, starting from the end of this horizon. The latter cost, which we generally denote by J̃, is a function of the state where we may be at the end of the horizon. It may be computed by a variety of methods, possibly involving simulation and/or some given or separately derived heuristic/suboptimal policy. The use of simulation often allows for implementations that do not require a mathematical model, a major idea that has allowed the use of DP beyond its classical boundaries.</subfield>
   <subfield code="u">https://eclass.uoa.gr/modules/document/file.php/DI437/Reinforcement_Learning_Bertsekas_Draft.pdf</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
   <subfield code="a">Publikace se věnuje řešení rozsáhlých a komplexních vícestupňových rozhodovacích problémů, u nichž je přesné řešení pomocí dynamického programování výpočetně neproveditelné. Autor představuje metody založené na aproximacích, které umožňují nalézt suboptimální, avšak prakticky dobře použitelné řídicí strategie. Tyto přístupy jsou souhrnně označovány jako posilované učení, případně aproximované či neuro-dynamické programování. Kniha systematicky propojuje teorii optimálního řízení s koncepty umělé inteligence a vytváří srozumitelný most mezi oběma oblastmi. Hlavní důraz je kladen na aproximaci hodnotové funkce a na metody využívající simulace, které často nevyžadují explicitní matematický model systému.</subfield>
  </datafield>
  <datafield tag="650" ind1="0" ind2="7">
   <subfield code="a">vícekriteriální rozhodování</subfield>
   <subfield code="7">ph127397</subfield>
   <subfield code="2">czenas</subfield>
  </datafield>
  <datafield tag="650" ind1="0" ind2="7">
   <subfield code="a">matematická optimalizace</subfield>
   <subfield code="7">ph122672</subfield>
   <subfield code="2">czenas</subfield>
  </datafield>
  <datafield tag="650" ind1="0" ind2="7">
   <subfield code="a">optimalizační metody</subfield>
   <subfield code="7">ph171359</subfield>
   <subfield code="2">czenas</subfield>
  </datafield>
  <datafield tag="650" ind1="0" ind2="7">
   <subfield code="a">generativní umělá inteligence</subfield>
   <subfield code="7">ph1268083</subfield>
   <subfield code="2">czenas</subfield>
  </datafield>
  <datafield tag="655" ind1=" " ind2="7">
   <subfield code="a">monografie</subfield>
   <subfield code="7">fd132842</subfield>
   <subfield code="2">czenas</subfield>
  </datafield>
  <datafield tag="650" ind1="0" ind2="9">
   <subfield code="a">multicriteria decision making</subfield>
   <subfield code="2">eczenas</subfield>
  </datafield>
  <datafield tag="650" ind1="0" ind2="9">
   <subfield code="a">mathematical optimization</subfield>
   <subfield code="2">eczenas</subfield>
  </datafield>
  <datafield tag="650" ind1="0" ind2="9">
   <subfield code="a">optimization methods</subfield>
   <subfield code="2">eczenas</subfield>
  </datafield>
  <datafield tag="650" ind1="0" ind2="9">
   <subfield code="a">generative artificial intelligence</subfield>
   <subfield code="2">eczenas</subfield>
  </datafield>
  <datafield tag="655" ind1=" " ind2="9">
   <subfield code="a">monographs</subfield>
   <subfield code="2">eczenas</subfield>
  </datafield>
 </record>
</collection>
