Document Type: Research Paper

Authors

1 Department of Aerospace Engineering, Faculty of Mechanical and Aerospace Engineering, Islamic Azad University, Science and Research Branch, Tehran, Iran

2 Department of Aerospace Engineering, Khajeh Nasir al-Din Tusi University, Tehran, Iran

Abstract

The purpose of this paper is to validate a model-free optimal control theory. The theory is derived from the principles of dynamic programming and is formulated for discrete-time systems. The controller design depends only on the input/output (I/O) data of the controlled plant; hence, the controller is independent of the plant model. Two steps were taken to evaluate the controller. First, the model-free optimal controller was designed for spacecraft attitude control, in order to apply it to the spacecraft model and assess its performance on the spacecraft system. Second, a linear quadratic regulator (LQR) controller was designed for the same attitude control problem, to serve as a baseline for comparison with the model-free optimal controller. If the difference between the two controllers proved to be small, the theory would be considered validated. The results indicate that the model-free controller is valid and performs acceptably.
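The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical single-axis attitude model (a discretized double integrator with inertia J and sample time dt), identity state weights, and a unit control weight, none of which come from the paper. The model-free controller is stood in for by a Q-function policy iteration fitted by least squares on (x, u, x_next) samples, which is one common dynamic-programming approach; the matrices A and B are used only to simulate the plant and to compute the LQR baseline.

```python
import numpy as np

# Hypothetical single-axis attitude model (double integrator); values are assumptions.
dt, J = 0.1, 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])          # state x = [angle, rate]
B = np.array([[dt**2 / (2 * J)], [dt / J]])    # input u = torque
Qc, Rc = np.eye(2), np.array([[1.0]])          # assumed cost weights
n, m = 2, 1

def lqr_gain(iters=5000):
    """Model-based baseline: iterate the discrete-time Riccati recursion."""
    P = Qc.copy()
    for _ in range(iters):
        K = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
        P = Qc + A.T @ P @ (A - B @ K)
    return np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)

def collect_data(num=200, seed=0):
    """Simulated plant: returns only the measured samples (x, u, x_next)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(num, n))
    U = rng.normal(size=(num, m))
    return X, U, X @ A.T + U @ B.T

def quad_features(Z):
    """Row-wise Kronecker product, so that features @ vec(H) = z' H z."""
    return np.einsum("ij,ik->ijk", Z, Z).reshape(len(Z), -1)

def model_free_gain(X, U, Xn, K0, iters=15):
    """Policy iteration on the Q-function using data only (A, B are not used)."""
    K = K0.copy()
    cost = np.einsum("ij,jk,ik->i", X, Qc, X) + np.einsum("ij,jk,ik->i", U, Rc, U)
    Z = np.hstack([X, U])
    for _ in range(iters):
        # Policy evaluation: z'Hz - z_next'Hz_next = stage cost, with u_next = -K x_next.
        Zn = np.hstack([Xn, -Xn @ K.T])
        h, *_ = np.linalg.lstsq(quad_features(Z) - quad_features(Zn), cost, rcond=None)
        H = h.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)
        # Policy improvement: minimize z'Hz over u, giving u = -K x.
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K

K_lqr = lqr_gain()
X, U, Xn = collect_data()
K_mf = model_free_gain(X, U, Xn, K0=np.array([[1.0, 2.0]]))  # K0: assumed stabilizing gain
print("LQR gain:       ", K_lqr)
print("Model-free gain:", K_mf)
```

In this noise-free setting the two gains should agree closely, which mirrors the abstract's validation criterion. The initial gain K0 must be stabilizing for the policy evaluation step to be well posed.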

Keywords
