Learning What To Memorize: Using Intrinsic Motivation To Form Useful Memory in Partially Observable Reinforcement Learning
| dc.contributor.author | Demir, Alper | |
| dc.date.accessioned | 2023-06-16T12:47:51Z | |
| dc.date.available | 2023-06-16T12:47:51Z | |
| dc.date.issued | 2023 | |
| dc.description.abstract | Reinforcement Learning faces an important challenge in partially observable environments with long-term dependencies. In order to learn in an ambiguous environment, an agent has to keep previous perceptions in a memory. Earlier memory-based approaches use a fixed method to determine what to keep in the memory, which limits them to certain problems. In this study, we follow the idea of giving the control of the memory to the agent by allowing it to take memory-changing actions. Thus, the agent becomes more adaptive to the dynamics of an environment. Further, we formalize an intrinsic motivation to support this learning mechanism, which guides the agent to memorize distinctive events and enables it to disambiguate its state in the environment. Our overall approach is tested and analyzed on several partially observable tasks with long-term dependencies. The experiments show a clear improvement in terms of learning performance compared to other memory-based methods. | en_US |
| dc.description.sponsorship | Scientific and Technological Research Council of Turkey [120E427] | en_US |
| dc.description.sponsorship | Acknowledgements: This work is supported by the Scientific and Technological Research Council of Turkey under Grant No. 120E427. Authors would also like to thank Huseyin Aydin, Erkin Cilden and Faruk Polat for their support. | en_US |
| dc.identifier.doi | 10.1007/s10489-022-04328-z | |
| dc.identifier.issn | 0924-669X | |
| dc.identifier.issn | 1573-7497 | |
| dc.identifier.scopus | 2-s2.0-85148369315 | |
| dc.identifier.uri | https://doi.org/10.1007/s10489-022-04328-z | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14365/889 | |
| dc.language.iso | en | en_US |
| dc.publisher | Springer | en_US |
| dc.relation.ispartof | Applied Intelligence | en_US |
| dc.rights | info:eu-repo/semantics/openAccess | en_US |
| dc.subject | Memory | en_US |
| dc.subject | Intrinsic motivation | en_US |
| dc.subject | Partial observability | en_US |
| dc.subject | Reinforcement learning | en_US |
| dc.subject | Agents | en_US |
| dc.title | Learning What To Memorize: Using Intrinsic Motivation To Form Useful Memory in Partially Observable Reinforcement Learning | en_US |
| dc.type | Article | en_US |
| dspace.entity.type | Publication | |
| gdc.author.scopusid | 57549355800 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C4 | |
| gdc.coar.access | open access | |
| gdc.coar.type | text::journal::journal article | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | İzmir Ekonomi Üniversitesi | en_US |
| gdc.description.departmenttemp | [Demir, Alper] Izmir Univ Econ, Dept Comp Engn, TR-35330 Izmir, Turkiye | en_US |
| gdc.description.endpage | 19092 | |
| gdc.description.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | Q1 | |
| gdc.description.startpage | 19074 | |
| gdc.description.volume | 53 | |
| gdc.description.wosquality | Q2 | |
| gdc.identifier.openalex | W3208926142 | |
| gdc.identifier.wos | WOS:000937666700001 | |
| gdc.index.type | WoS | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 3.0 | |
| gdc.oaire.influence | 2.8965481E-9 | |
| gdc.oaire.isgreen | true | |
| gdc.oaire.keywords | FOS: Computer and information sciences | |
| gdc.oaire.keywords | Computer Science - Machine Learning | |
| gdc.oaire.keywords | Artificial Intelligence (cs.AI) | |
| gdc.oaire.keywords | Computer Science - Artificial Intelligence | |
| gdc.oaire.keywords | Machine Learning (cs.LG) | |
| gdc.oaire.popularity | 4.4075E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.oaire.sciencefields | 0202 electrical engineering, electronic engineering, information engineering | |
| gdc.oaire.sciencefields | 02 engineering and technology | |
| gdc.oaire.sciencefields | 01 natural sciences | |
| gdc.oaire.sciencefields | 0105 earth and related environmental sciences | |
| gdc.openalex.collaboration | National | |
| gdc.openalex.fwci | 0.0 | |
| gdc.openalex.normalizedpercentile | 0.0 | |
| gdc.opencitations.count | 1 | |
| gdc.plumx.mendeley | 8 | |
| gdc.plumx.newscount | 1 | |
| gdc.plumx.scopuscites | 3 | |
| gdc.scopus.citedcount | 3 | |
| gdc.virtual.author | Demir, Alper | |
| gdc.wos.citedcount | 2 | |
| relation.isAuthorOfPublication | c9c431c0-6d14-4dac-87af-29d85e10ef21 | |
| relation.isAuthorOfPublication.latestForDiscovery | c9c431c0-6d14-4dac-87af-29d85e10ef21 | |
| relation.isOrgUnitOfPublication | b4714bc5-c5ae-478f-b962-b7204c948b70 | |
| relation.isOrgUnitOfPublication | 26a7372c-1a5e-42d9-90b6-a3f7d14cad44 | |
| relation.isOrgUnitOfPublication | e9e77e3e-bc94-40a7-9b24-b807b2cd0319 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | b4714bc5-c5ae-478f-b962-b7204c948b70 |
