zhēnjié; Zhenjie Ren – November 24th

On YouTube, a video dedicated to « zhēnjié »

One World Probability shares its take on « zhēnjié »

This video was made public on YouTube by One World Probability and is dedicated to « zhēnjié »:

We spotted this video recently, while it was drawing traffic. The like counter stood at 1.

Worth noting are the duration (00:51:35), the title (Zhenjie Ren – November 24th) and the details supplied by the author, including the description: « ».

Thanks to its global reach, YouTube lets everyone discover engaging videos on a wide range of topics in a safe environment. It is an ideal platform for exploring original ideas and having constructive conversations.

Grasping the concept of chastity in a contemporary setting. Examining chastity in light of modern realities.

Essentially, chastity is self-control in matters of sexuality. It is not merely abstinence, but a deliberate management of sexual desire according to moral or spiritual principles. In the modern world, chastity is not about repressing desire but about directing it towards higher aims, such as respect for oneself and for others. Being chaste today does not mean giving up pleasure, but rather living one's sexuality according to one's own principles.

Gauging the influence of chastity on relationships with others and within the family.

Moreover, chastity has a beneficial impact on interpersonal relationships. The chastity cage helps a man strengthen his ability to seduce and to adjust his behaviour towards his partners. Intimacy benefits from stronger physical and sexual capacities owing to their reduced use. Chastity can be practised with complete discretion, without necessarily letting one's partners in on the secret. Within marriage, chastity helps strengthen the conjugal bond by cultivating a sincere love that goes beyond physical pleasure.

Adopting chastity in everyday life.

Men seeking to live chastely have several strategies at their disposal. Starting with inner reflection, to understand one's motivations and values, is essential. To master one's desires, it can help to avoid sexual content and situations. A mentor, or a support group sharing the same values, can be crucial for staying committed. Chastity can be a challenge in a society where sexuality is omnipresent. Temptation and social pressure are significant obstacles in the practice of chastity. Maintaining rigorous personal discipline is crucial for overcoming them. In times of difficulty, it is important to keep one's spirits up and begin again with renewed resolve. The goal is not to reach a perfect state of chastity, but to follow a path of patience and perseverance. By integrating chastity into one's life, one can attain greater freedom, better self-mastery and real spiritual fulfilment. Chastity may seem constraining in a society that values sexuality more than spirituality, but it offers a path towards a more sincere life, in keeping with one's values and faith.

FAQ: everything you need to know about chastity.

Is chastity only for people of faith? Chastity also concerns non-religious people, such as singles and lay people.
How does chastity differ from abstinence? Abstinence is the act of refraining from sexual relations. Chastity often involves wearing a device such as a belt or a cage, and follows an approach oriented towards progress and achievement, much like an athlete's.
How does chastity play out in married life? Within marriage, chastity is often a shared matter; partners generally discuss the approach and the goals.
Why is chastity an important virtue for the Church? The Church values chastity because it regards this virtue as essential to living a life in keeping with Christian principles.
How does chastity foster personal development? The practice of chastity favours personal growth by developing self-control, mental clarity and inner peace.

Chastity: a virtue worth rediscovering for the modern man.

Chastity is often regarded as a taboo virtue in the modern context. Yet for those who adopt it, it can lead to greater inner peace, stronger relationships and a deeper spiritual connection. In the past, chastity was more widely accepted and discussed. www.chastete.fr covers the topic of chastity in depth. By providing an overview of chastity, this article gives men the keys to understanding it and putting it into practice in their daily lives.

There is a deep link between chastity and the spiritual quest.

Chastity is commonly associated with the spiritual quest. Many religions, including Christianity, regard chastity as a path towards sanctification. By regulating sexual desire, one frees up energy for inner well-being. From this perspective, chastity is an offering of oneself and a mark of respect towards God. Rather than a deprivation, chastity is seen as a choice made to elevate the soul. Religious traditions offer a variety of views on chastity. In Catholic Christianity, chastity is a fundamental virtue for priests. In Islam, strict rules govern sexuality in order to promote chastity. Ascetics in Hinduism and Buddhism practise chastity in order to reach enlightenment. Believers of various religions are united in a common pursuit of chastity.

The benefits of chastity include a significant impact on personal and moral well-being; let us look at that impact more closely.

The impact of chastity on personal well-being is profound when it is practised consciously. The practice brings greater self-mastery, sharper mental clarity and an inner peace that comes from honouring one's moral values. Chastity fosters a more harmonious relationship with one's own body and desires. Through self-mastery, chastity allows greater freedom by removing compulsions and the social pressures surrounding sexuality. Chastity offers a heightened sense of moral purity, which reinforces dignity and self-esteem. The effects of chastity on mental health are especially noticeable. Chastity gives individuals greater self-confidence and better readiness to face challenges.

Tracing the historical and cultural origins of chastity.

Chastity has its roots in many religious and cultural traditions. In Christianity, chastity is often connected with the vow of continence taken by priests and religious. Islam, like the Catholic and Orthodox Churches, values chastity as an important virtue for religious and lay people alike, particularly before marriage. In antiquity, chastity was respected as a way of protecting personal integrity and moral purity. Chastity has thus travelled across ages and cultures while retaining its status as a respected virtue.

Here is the link to watch the video on YouTube:
the original post: Click here

#Zhenjie #Ren #November #24th

Transcript of the video:

Are you ready now? Yeah, for sure. All right, so it's a pleasure to announce our second speaker today, Zhenjie Ren from Université Paris, who will talk about mean-field optimization regularized by Fisher information. Please.

Thank you, thanks David and the organizers for the warm introduction and the invitation. Today it is my pleasure to talk in this probability seminar about our recent research on mean-field optimization regularized by the so-called Fisher information. This is joint work with Giovanni and Julien in Paris. It is ongoing work, but it should appear on arXiv by next month, I think. Well, let's see.

First, let's talk about the motivation. For us, the initial motivation comes from deep learning. Nowadays everybody knows what a deep neural network is: simply speaking, it is just a particular parametrization of a desired function. Given a target function f, continuous on a compact set, say, the so-called neural network is a parametrization of this form, a composition of activation functions φ; each layer takes this form, a linear combination of a nonlinear function of another linear combination. Everybody knows that. The universal representation theorem claims that if the activation function is non-constant, bounded and continuous, then any continuous function with compact support can be approximated in this way. This ensures the expressiveness of a neural network. However, why this kind of parametrization can be trained, can be approximated, is still a mystery in mathematics, because, as we will see soon, we are facing an over-parametrized non-convex optimization, which in general should not be so easy to solve mathematically. Throughout the talk we will only consider the so-called two-layer network, or one-hidden-layer network; one hidden layer means there is only one activation φ and no composition.

All right, so to find the optimal weights of the neural net, c_k, a_k and b_k, you need to solve this kind of optimization problem. As you can see, because the function φ is nonlinear — in particular you can choose the ReLU activation function, or the sigmoid, quite irregular functions — this is a non-convex optimization with a lot of parameters, so it is difficult. In previous work in the literature, the trick is to add this normalization and then treat the empirical sum as an approximation of a mathematical expectation over random variables (C, A, B). Once you write the optimization on the probability measure space, the problem becomes convex: the objective function F is a function defined on the probability measure space and it is convex, in fact just quadratic.

So now we are facing this convex mean-field optimization problem, inspired by the neural network problem. In order to solve this kind of convex mean-field optimization problem, let us first review some literature. In previous work with Kaitong, David and Łukasz, we add a regularizer to this mean-field optimization problem, namely the relative entropy, denoted by H; σ is a temperature, usually taken to be a small number.
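The slides themselves are not reproduced on this page, so here is a hedged reconstruction of the objects the speaker describes up to this point; the notation, the choice of a quadratic loss and the constants are mine and may differ from the slides.

\[
f(x)\;\approx\;\frac{1}{n}\sum_{k=1}^{n} c_k\,\varphi(a_k\cdot x+b_k)\;=\;\int c\,\varphi(a\cdot x+b)\,m(\mathrm{d}c,\mathrm{d}a,\mathrm{d}b),
\qquad m=\frac{1}{n}\sum_{k=1}^{n}\delta_{(c_k,a_k,b_k)},
\]
\[
\min_{m\in\mathcal{P}} F(m),\qquad
F(m)=\mathbb{E}_X\Big[\big|f(X)-\mathbb{E}_{(C,A,B)\sim m}\big[C\,\varphi(A\cdot X+B)\big]\big|^{2}\Big],
\]
which is convex (indeed quadratic) in the measure m; the earlier work then studies the entropy-regularized objective \(F(m)+\sigma H(m)\) for a small temperature σ.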
And we relate this entropy-regularized mean-field optimization problem to the so-called mean-field Langevin diffusion written here. The δF/δm has not been introduced yet, but we will see it later: it is an intrinsic derivative of a function defined on the probability measure space. And we prove that the marginal law of this mean-field Langevin diffusion converges to the unique minimizer of the regularized mean-field optimization problem.

In this talk we change the angle of our research: instead of adding the relative entropy, we use the Fisher information as the regularizer — the Fisher information is written in this red term. We will see that changing the regularizer of course also changes the dynamics which approximates the minimizer, and we will see how it changes the whole story. For people from other communities, closer to quantum mechanics, the Fisher information is nothing but the kinetic energy in the quantum-mechanics literature: if you look at the square root of the probability density, namely the wave function, the integral of the squared gradient of the wave function is the so-called kinetic energy.

A side remark: the Fisher information, like the relative entropy, is a strictly convex functional on the probability measure space, and if we assume, as in the machine-learning example, that F itself is convex, then the sum F^σ, the so-called free energy function, is a strictly convex functional on the probability measure space. So that is the regularized mean-field optimization problem we study in this talk.

The first step in understanding this kind of minimization is to characterize the minimizer of such a function of probability measures. Let F be such a function; first we define the so-called linear derivative, denoted δF/δm, in the following fashion: basically, the function which satisfies the Taylor expansion on the probability measure space is defined to be the linear derivative. This is an abstract definition; the examples make it clearer. First example: if F is linear, that is, just the expectation of a given integrable function φ, then the linear derivative is the integrand itself, φ(x); since F is linear, the linear derivative does not depend on m, it is just a function of x. In a more general example, let F be a nonlinear function G of the expectation of φ; then we can apply the chain rule to compute the linear derivative of this kind of F: it is the derivative of G times the linear derivative of the expectation, which is φ, according to example (a).

That is the definition of the linear derivative. Why is it important for analyzing the minimization of convex functionals? Because of this simple observation: let F be convex; then it satisfies this simple inequality. Move the red term to the right-hand side, develop the right-hand side using the definition of the linear derivative, and take the limit as ε goes to zero; this gives the inequality, and clearly a sufficient condition for m to be the minimizer is that the blue term, δF/δm(m, ·), is equal to a constant.
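Again as a hedged reconstruction in my own notation, the linear derivative and the sufficient condition being described read, schematically,

\[
F(m')-F(m)=\int_0^1\!\!\int \frac{\delta F}{\delta m}\big((1-\lambda)m+\lambda m',x\big)\,(m'-m)(\mathrm{d}x)\,\mathrm{d}\lambda ,
\]
with the two examples
\[
\text{(a)}\;\; F(m)=\int\varphi\,\mathrm{d}m\;\Rightarrow\;\frac{\delta F}{\delta m}(m,x)=\varphi(x),
\qquad
\text{(b)}\;\; F(m)=G\Big(\int\varphi\,\mathrm{d}m\Big)\;\Rightarrow\;\frac{\delta F}{\delta m}(m,x)=G'\Big(\int\varphi\,\mathrm{d}m\Big)\,\varphi(x),
\]
and, for convex F,
\[
F(m')-F(m)\;\ge\;\int \frac{\delta F}{\delta m}(m,x)\,(m'-m)(\mathrm{d}x),
\]
so that if \(\frac{\delta F}{\delta m}(m,\cdot)\) is constant, the right-hand side vanishes for every probability measure \(m'\) and m is a minimizer.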
Once it is equal to a constant, since m′ and m are both probability measures, the right-hand side cancels: it is zero, and it is zero for every m′, which means that m is exactly the minimizer of the functional F. So this gives the sufficient condition: by introducing the linear derivative we can characterize the minimizer in this way.

Let us come back to our free energy function F^σ; recall that it is just the objective function F plus a small temperature times the Fisher information. First, to justify that the Fisher information is a legitimate regularizer, we have a little theorem saying that as σ goes to zero the minimum of F^σ converges to the minimum of F itself, so the regularizer does not introduce bias as the temperature goes to zero. The next point — I already mentioned it — is that F^σ is strictly convex, so if there is a minimizer it is unique, at least within a certain subset.

By some calculus of variations we can obtain the first-order condition for this problem. Let us forget about the details; basically it is the same as what was written on the previous slides, except that with the regularizer, instead of the simple inequality we saw before, we have this extra blue term, which comes from the calculus of variations on the Fisher information. If you assume m to be regular enough, you can rewrite the blue term as this red term, which simplifies things a little. This also gives a sufficient condition: if the integrand on the right-hand side is a constant, then, by the same logic as before, m is the minimizer, and since F^σ is strictly convex it must be the unique minimizer. Here we introduce a useful notation which we will come back to in later slides: as we said, this term comes from a variation, so basically you can take it as a derivative of the free energy function F^σ, and we define δF^σ/δm to be this term. I call it a notation because, strictly speaking, the free energy function F^σ is not differentiable in the sense of our previous definition.

Now, looking at this first-order equation, we can in fact relate the minimization problem to some other interesting problems through changes of variable. First let us change the variable to ψ = √m. As I mentioned before, m is a probability measure, and the square root of a probability density is, in quantum mechanics, a wave function; what the wave function satisfies here is in fact a Schrödinger equation, the mean-field Schrödinger equation: if m* is the minimizer according to the first-order equation and ψ* is the square root of m*, then ψ* satisfies an eigenvalue problem for a mean-field Schrödinger operator. In the simple case where F is linear — our example (a), if you remember — this goes back to classical quantum mechanics: the constant is the smallest eigenvalue of the Schrödinger operator and ψ is the so-called ground state. In general, if F is not linear, this is the problem of finding the ground state of a nonlinear Schrödinger equation, which is a practical problem in quantum chemistry, for example.
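A hedged reconstruction of the regularized problem and its first-order condition, with my normalization of the Fisher information (the slides may use different constants):

\[
F^{\sigma}(m)=F(m)+\sigma\,I(m),\qquad
I(m)=\int\big|\nabla\log m\big|^{2}\,\mathrm{d}m=4\int\big|\nabla\sqrt{m}\big|^{2}\,\mathrm{d}x,
\]
and, for m regular enough,
\[
\frac{\delta F^{\sigma}}{\delta m}(m,x):=\frac{\delta F}{\delta m}(m,x)-4\sigma\,\frac{\Delta\sqrt{m}(x)}{\sqrt{m}(x)}=\text{constant}.
\]
Writing \(\psi=\sqrt{m}\), this is the eigenvalue problem for a mean-field Schrödinger operator,
\[
-4\sigma\,\Delta\psi+\frac{\delta F}{\delta m}(\psi^{2},\cdot)\,\psi=\lambda\,\psi ,\qquad \psi^{*}=\sqrt{m^{*}},
\]
which reduces to the classical ground-state problem when F is linear.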
So in fact we can treat this as our second motivation, besides the neural networks. Another interesting change of variable comes here: if we define u = −log m, we observe that u satisfies this ergodic HJB equation, a Hamilton–Jacobi–Bellman equation, and for people familiar with the mean-field games literature, this HJB equation characterizes the value function — or rather the optimal control — of a potential ergodic mean-field game. So by these two changes of variable you can relate this mean-field optimization problem to these two interesting problems. That is a side remark.

Now let us introduce the dynamics which approximates the minimizer of our mean-field optimization problem regularized by Fisher information. Again, look at the first-order equation: if m* is the optimizer, then ψ*, defined as the square root of m*, satisfies this Schrödinger eigenvalue problem. Based on this static problem we introduce a dynamic version and call it the mean-field Schrödinger dynamics: basically we add a ∂_t ψ here. A small remark, which I did not mention before: the linear derivative is only defined up to a constant, but by defining the equation in this way it is no longer up to a constant — though that is not so important for this talk. We are going to show that the solution ψ_t converges to the optimizer ψ*; that is one of our objectives.

We come back to the change of variable and define m_t = ψ_t². In fact the flow m_t satisfies this so-called Fokker–Planck equation. For people familiar with stochastic processes, you can treat m_t as the marginal law of a mean-field diffusion with the death rate written in the bracket; we will come back to this point of view later. By some simple calculus we can rewrite this equation in this way, and recall that this red term is the notation we introduced on the slide with the first-order condition, the so-called linear derivative of the free energy function F^σ. So the Fokker–Planck equation can be written simply as ∂_t m = −(δF^σ/δm) m. Keep this in mind. On the other hand, if you recall the other change of variable, u = −log m, then the dynamic, time-dependent version of u also satisfies an HJB equation — no longer an ergodic HJB equation but a parabolic HJB equation.
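A hedged reconstruction of the dynamics just introduced, with the same conventions as above:

\[
\partial_t\psi_t=2\sigma\,\Delta\psi_t-\tfrac12\,\frac{\delta F}{\delta m}(\psi_t^{2},\cdot)\,\psi_t+c_t\,\psi_t,
\qquad m_t:=\psi_t^{2},
\]
where c_t is the constant that keeps m_t a probability measure (the speaker's remark about the "up to a constant" ambiguity being fixed). Squaring gives the Fokker–Planck-type equation
\[
\partial_t m_t=-\Big(\frac{\delta F^{\sigma}}{\delta m}(m_t,\cdot)-2c_t\Big)\,m_t ,
\]
which, in the probabilistic reading sketched in the talk, is the normalized law of a diffusing particle subject to killing at the rate appearing in the bracket, rather than a pure drift–diffusion as in Langevin dynamics.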
So you can see that these three equations — the Schrödinger equation, the Fokker–Planck equation and the HJB equation — are in one-to-one correspondence with each other, like a trinity, and we will exploit this.

The first thing to establish, before studying the limit, is the well-posedness of this mean-field equation. Since the three equations are equivalent, one-to-one with each other, we only need to pick our favorite one to study; so let us pick the mean-field HJB equation and prove its well-posedness. To prove the well-posedness of this kind of mean-field equation, you only need to prove that a fixed-point map is a contraction: given a flow m̄, you plug it into the HJB equation, which then becomes a classical HJB equation; define u as the solution of this classical HJB equation, and then define a new probability flow m using this potential u. This gives the fixed-point mapping from m̄ to m, and once you prove that this mapping is a contraction and admits a unique fixed point, you have proved that the mean-field HJB equation is well posed. In particular, you only need to prove the contraction on a short horizon; you can then paste the solutions together to reach longer horizons. The strategy is a classical one, based on estimates. First, some stability estimates — not difficult to prove — show that the difference between the gradients of u and u′ can be dominated by a constant times the Wasserstein distance between the two given flows m̄ and m̄′. The second ingredient is more original: using a reflection-coupling argument, we prove that for m and the corresponding m′, the Wasserstein-1 distance between m_t and m′_t can be dominated by the L∞ norm of the difference of the gradients of the potentials, u being the potential of m and u′ the potential of m′. Combining these two estimates, the Wasserstein-1 distance between the new flows m and m′ is dominated by a constant times the Wasserstein-1 distance between the input flows m̄ and m̄′, and since the constant tends to zero as the horizon T goes to zero, for T small enough you get a contraction in this sense.
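Schematically, and as my own hedged paraphrase of the two estimates just described (the precise norms and constants are on the slides, not here), the contraction argument combines

\[
\sup_{t\le T}\big\|\nabla u_t-\nabla u'_t\big\|_{\infty}\;\le\;C_1(T)\,\sup_{t\le T}\mathcal{W}_1(\bar m_t,\bar m'_t)
\qquad\text{and}\qquad
\sup_{t\le T}\mathcal{W}_1(m_t,m'_t)\;\le\;C_2(T)\,\sup_{t\le T}\big\|\nabla u_t-\nabla u'_t\big\|_{\infty},
\]
with a product \(C_1(T)\,C_2(T)\) that tends to zero as \(T\to0\), so the map \(\bar m\mapsto m\) is a contraction on a short horizon.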
So much for the well-posedness. In addition, we can in fact obtain a more important result about regularity. Again look at this mean-field HJB equation: we already saw the fixed-point problem, and now we want to look into the regularity of its solution. As a very important assumption in this paper, the linear derivative δF/δm can be decomposed into two parts, a capital G and a small g, where the small g is a function of x only and the capital G carries the mean-field dependence. We assume that the part with mean-field dependence, G, is uniformly Lipschitz in x, and that the x-only part g, which you can regard as a confinement potential, is strictly convex — its Hessian has a strictly positive lower bound c — and its Hessian also admits an upper bound. Using a probabilistic argument — in particular techniques from forward–backward SDEs and reflection coupling — we can prove that the solution of this HJB equation preserves this kind of decomposition: namely, it can be decomposed into two parts, where one part, w, is Lipschitz like the capital G, and the other part is strictly convex like the little g. Moreover, the Lipschitz constant, the convexity lower bound and the Hessian bounds do not depend on time, so we have uniform-in-time estimates on these regularities. This will play a crucial role in the later analysis; I will not go into the details, I just state the result.

In order to prove the convergence of the so-called mean-field Schrödinger dynamics towards the minimizer of the mean-field optimization problem regularized by the Fisher information, one key observation is the following energy dissipation. F^σ is the free energy function — F plus the Fisher-information regularizer — and once we plug in m_t, the marginal law of the mean-field Schrödinger dynamics, we can compute the time derivative of the free energy; it is given explicitly by this expression. Why? Formally it is very easy to see: the time derivative of the free energy equals this because of the definition of the linear derivative — the linear derivative means that the difference between functional values can be written as the integral of the linear derivative against the difference of the measures, and over an infinitesimal time that difference is just ∂_t m_t. Now develop ∂_t m_t using the Fokker–Planck equation introduced before and use our notation δF^σ/δm; recalling that the Fokker–Planck equation can be written simply in that form, this already gives the result. That is why I say it is easy to obtain formally; to prove it rigorously we rely exactly on the regularity results shown on the previous slides.

Intuitively this already gives the convergence result. Why? Recall that the first-order condition for m* to be the minimizer is that this linear derivative of the free energy function is constant; the energy dissipation says that along the mean-field Schrödinger dynamics the energy always decreases at this rate, so it keeps decreasing until the rate touches zero, and once the rate touches zero you invoke the first-order condition: you have reached the minimizer. That is the spirit, the intuition behind the proof. Here is the result — I do not show the real proof, it is a little technical, but eventually we can follow this intuition and prove that the marginal law m_t of the Schrödinger dynamics really converges to the minimizer m* in the sense of Wasserstein-1. In fact we can do much better: we can prove exponential convergence. Using the convexity of F, a Poincaré-type inequality and the regularity estimates I showed before, we can prove this kind of inequality: the dissipation rate of the energy dominates a constant times the difference between the current energy and the minimal energy (recall that we already proved m_∞ = m*). This gives the exponential convergence by Grönwall's inequality, as you can see, and the convergence rate C depends on the constant of the Poincaré-type inequality and on the regularity estimates we obtained.
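In the notation used above (a hedged sketch, not the slide's exact statement), the dissipation identity and the resulting convergence read

\[
\frac{\mathrm{d}}{\mathrm{d}t}F^{\sigma}(m_t)
=\int \frac{\delta F^{\sigma}}{\delta m}(m_t,x)\,\partial_t m_t(\mathrm{d}x)
=-\int\Big|\frac{\delta F^{\sigma}}{\delta m}(m_t,x)-2c_t\Big|^{2} m_t(\mathrm{d}x)\;\le\;0,
\]
so the energy is stationary only when \(\frac{\delta F^{\sigma}}{\delta m}(m_t,\cdot)\) is m_t-a.s. constant, i.e. at the first-order condition; a Poincaré-type inequality then upgrades this to
\[
\frac{\mathrm{d}}{\mathrm{d}t}\big(F^{\sigma}(m_t)-F^{\sigma}(m^{*})\big)\le -C\,\big(F^{\sigma}(m_t)-F^{\sigma}(m^{*})\big),
\]
hence \(F^{\sigma}(m_t)-F^{\sigma}(m^{*})\le e^{-Ct}\big(F^{\sigma}(m_0)-F^{\sigma}(m^{*})\big)\) by Grönwall's inequality.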
So far we have proved that the mean-field Schrödinger dynamics converges to the minimizer of the mean-field optimization problem regularized by the Fisher information, so it offers an approximation of the minimizer. But in fact you can construct infinitely many stochastic processes which approximate this kind of target measure, so in what sense is the Schrödinger dynamics optimal? That question draws us to the study of gradient flows.

Let us recall what a gradient flow is, first in the basic Euclidean case. In Euclidean space the gradient flow is expressed by this ODE: F is the potential, the landscape, and the gradient flow looks like this. We can also write it in another way. Expand the implicit Euler scheme: the continuous-time dynamics is the limit of the implicit Euler scheme, and each step of the implicit Euler scheme can be read as a variational problem — given x at step i, you solve a variational problem to obtain x at step i+1, with time step h. This is the so-called variational formulation of the gradient flow. We also recall the analogous work of JKO on gradient flows on the probability measure space, the Wasserstein-2 space: given a function F of probability measures, we can define a similar variational problem to determine a gradient flow — here, to be more precise, instead of the Euclidean distance you use the Wasserstein-2 distance. What makes the story more interesting is when you add the entropy as a regularizer to this objective, this potential F: the JKO paper tells us that the discrete-time flow then converges to a continuous-time flow which solves the Fokker–Planck equation, and as we all know this Fokker–Planck equation describes the marginal law of the Langevin dynamics. In other words, the Langevin dynamics is the gradient flow of the functional regularized by relative entropy in the Wasserstein-2 space.

Here we have a similar story for the mean-field Schrödinger dynamics. What we do is make two replacements. First we replace the metric on the space: instead of the squared Wasserstein-2 distance, we use the relative entropy. Then we also replace the regularizer: the previous regularizer, the relative entropy, is replaced by the new regularizer, the Fisher information. After these two replacements we are in fact defining the gradient flow of this regularized potential in the space metrized by the relative entropy, and what we get is exactly our mean-field Schrödinger dynamics. Here I show it formally: recall the first-order equation; it says that if m_{i+1} solves this mean-field optimization problem, then it satisfies this first-order condition, and we can prove that when the time step h goes to zero this discrete-time flow converges to a continuous-time flow m. Formally you can see what the limit should read, and it is nothing but the mean-field Schrödinger dynamics, because ∂_t log m_t is equal to ∂_t m_t divided by m_t. Right, so that is how we get it.
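As a hedged reconstruction of the three variational schemes being compared (my notation; the placement of the factors of 2 and of the step size h may differ from the slides):

\[
\text{Euclidean:}\quad x_{i+1}\in\arg\min_{x}\Big\{F(x)+\frac{|x-x_i|^{2}}{2h}\Big\}
\;\xrightarrow[h\to0]{}\;\dot x_t=-\nabla F(x_t),
\]
\[
\text{JKO:}\quad m_{i+1}\in\arg\min_{m}\Big\{F(m)+\sigma H(m)+\frac{\mathcal{W}_2^{2}(m,m_i)}{2h}\Big\}
\;\xrightarrow[h\to0]{}\;\text{Fokker–Planck, i.e. (mean-field) Langevin dynamics},
\]
\[
\text{this talk:}\quad m_{i+1}\in\arg\min_{m}\Big\{F(m)+\sigma I(m)+\frac{H(m\,|\,m_i)}{h}\Big\}
\;\xrightarrow[h\to0]{}\;\partial_t m_t=-\frac{\delta F^{\sigma}}{\delta m}(m_t,\cdot)\,m_t ,
\]
the mean-field Schrödinger dynamics: the first-order condition of the last scheme contains \(\log(m_{i+1}/m_i)/h\), which formally converges to \(\partial_t\log m_t=\partial_t m_t/m_t\).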
This is a formal argument, but in the paper we prove it in a rigorous manner. So we have shown that the mean-field Schrödinger dynamics converges to the minimizer when the problem is regularized by the Fisher information, and also that this dynamics is in fact optimal in the sense of a gradient flow with respect to the relative-entropy metric. Another question that matters for completing the story is why this is useful — in other words, how can we simulate this mean-field Schrödinger dynamics so as to approximate the minimizer?

To keep the story simple, let us introduce the numerical scheme in the linear case, F linear, the expectation of a small f. Then the Fokker–Planck equation can be written this way; as we said, it describes the marginal law of a diffusion with a death rate equal to this term, and C_t is just the normalization constant, so we do not need to care about it for now. That is to say, there is a natural diffusion behind this equation, but the problem is that we do not know the density m analytically, so we cannot evaluate this death rate numerically, and this certainly causes a problem for numerical simulation. How to make a detour around this difficulty? Let us again turn to the change of variable: ψ, the square root of m, satisfies this Schrödinger-type equation — you can call it the imaginary-time Schrödinger equation — and this C_t, the one in orange, is the normalization constant that makes m a probability measure. In fact you can introduce two rescalings of this equation: the first removes the normalization constant C; the second changes the normalization constant to the one that makes ψ̃ itself a probability measure — recall that C is the constant which makes m a probability measure, but now I change it to the constant which makes ψ̃ a probability measure. Why rescalings? Because the difference between ψ and ψ̄, or between ψ and ψ̃, is just a constant — a deterministic, time-dependent factor. And why do I introduce these two rescalings? Because both are numerically friendly. The first one, ψ̄, admits a probabilistic representation based on the Feynman–Kac formula, so it can be simulated numerically; and ψ̃_t is just the marginal of a diffusion with death rate f — the previous death rate involved the gradient of log m, the thing we cannot evaluate, but in this formulation it disappears, so ψ̃_t becomes numerically accessible. Eventually we are interested in m_t; recall that m_t is the square of ψ_t, and ψ differs from ψ̄ and from ψ̃ only by constants, which is why you can rewrite ψ² in this way. Since ψ̄ can be simulated using the Feynman–Kac representation, and ψ̃ can be sampled as the marginal law of a diffusion with a simple death rate — the x_i here being samples of this diffusion — after all we can represent m_t numerically as a weighted empirical measure, like this.
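The Monte Carlo scheme is only described in words above, so here is a minimal Python sketch of one way to implement it in dimension one. It follows the hedged reconstruction used earlier on this page (generator 2σΔ, death rate f/2); the temperature, the potential f, the initial law and the sample sizes are arbitrary illustrative choices, not the talk's example, and the killing is carried as Feynman–Kac weights rather than by removing particles.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative choices (not from the talk): temperature, potential, horizon, discretization.
sigma = 0.25
def f(x):                       # hypothetical confining potential
    return 0.5 * x**2
T, n_steps = 1.0, 100
dt = T / n_steps

# Initial law m_0 = N(0, 1); psi_0 = sqrt(m_0) is proportional to a N(0, 2) density.
def psi0(x):
    return np.exp(-x**2 / 4.0)

# psi_tilde_T: marginal of a Brownian motion with generator 2*sigma*Laplacian
# (i.e. dX = 2*sqrt(sigma) dW), started from psi_0 (normalized) and killed at rate f/2;
# the killing is carried as a Feynman-Kac survival weight S instead of removing particles.
N = 1000
X = rng.normal(0.0, np.sqrt(2.0), size=N)       # samples from psi_0 divided by its mass
S = np.ones(N)
for _ in range(n_steps):
    S *= np.exp(-0.5 * f(X) * dt)
    X += 2.0 * np.sqrt(sigma) * np.sqrt(dt) * rng.normal(size=N)

# psi_bar_T(x): Feynman-Kac representation of the unnormalized equation
#   d/dt psi_bar = 2*sigma*Laplacian(psi_bar) - (f/2)*psi_bar .
def psi_bar(x, n_paths=200):
    B = np.full(n_paths, x, dtype=float)
    W = np.ones(n_paths)
    for _ in range(n_steps):
        W *= np.exp(-0.5 * f(B) * dt)
        B += 2.0 * np.sqrt(sigma) * np.sqrt(dt) * rng.normal(size=n_paths)
    return np.mean(W * psi0(B))

# m_T is approximated by the weighted empirical measure sum_i weights[i] * delta_{X[i]},
# with weights proportional to S_i * psi_bar_T(X_i), since m_T is proportional to
# psi_bar_T * psi_tilde_T in the linear case.
weights = S * np.array([psi_bar(x) for x in X])
weights /= weights.sum()

print("approximate mean of m_T:    ", float(np.sum(weights * X)))
print("approximate variance of m_T:", float(np.sum(weights * X**2) - np.sum(weights * X)**2))

This brute-force version re-estimates psi_bar_T at every particle with fresh Feynman–Kac paths, so its cost is of order N × n_paths × n_steps; it is meant only to make the two probabilistic ingredients concrete, not to reproduce the talk's experiment.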
Everything here is numerically accessible, so this gives a way to simulate the so-called mean-field Schrödinger dynamics, which approximates the minimizer.

Let us do a quick numerical test. We stay in the linear case and take the small f, the potential, in this form. The mean-field Schrödinger dynamics should then give us the minimizer, which should concentrate here — and that is what it does: after our numerical simulation, the samples are concentrated in this area. This is the training error — sorry, the log training error, since it is plotted on a logarithmic scale — and we can see that we indeed have exponential convergence. Why did I choose this particular f, this landscape, this potential function? Because for this kind of potential function Langevin dynamics can never work: Langevin dynamics only sees the gradient of the potential, and here the gradient in the interesting region is always zero — the Langevin dynamics is blind between −5 and 5, say; it does not see the change of the landscape — whereas the Schrödinger dynamics can see it. So Langevin dynamics would fail in this case, but the Schrödinger dynamics still works.

Let me briefly conclude my talk. We studied a mean-field optimization problem inspired by the study of neural networks — the training of neural networks, to be more precise — and in this paper we focused on regularization using the Fisher information instead of the relative entropy. We introduced the mean-field Schrödinger dynamics, we found that the free energy always decreases along these dynamics, and we showed that the marginal law of the mean-field Schrödinger dynamics converges, in W1 or in L1, to the minimizer exponentially quickly. We also showed that this dynamics is special in the sense that it is a gradient flow if you metrize the probability measure space using the relative entropy. Finally, we showed that the method is numerically implementable by introducing this Monte Carlo method, and I think that even in the linear case this Monte Carlo method seems to be new to the literature, so it may have interesting applications even in quantum chemistry — but I am not expecting that myself; if anyone in the audience is exploring that direction, I am very interested in further discussion. So thank you very much for your attention; that is all for my presentation.

Thank you very much. Are there questions? Maybe then I can start with a curiosity. I think it is a very natural problem to look at this mean-field optimization problem, but you motivated it by this fitting of neural networks, so I am just wondering if you can now somehow come back to this program: do you learn something now that you can use to understand this calibration problem, this fitting problem of neural networks, better?

Better, I cannot say, and let me explain this point a little. Why can I not say better? Because here I introduced the algorithm for the linear case, and eventually you have to do this simulation, and this simulation is in fact very efficient in the linear case — for capital F the expectation of a small f it is very efficient, because you only need m_T at the end: in the linear case you do not need to evaluate the weights of the weighted empirical law along the way; you only need to sample the marginal law of the diffusion with death rate f.
You can sample it up to the final time T, and then at time T you compute the weights once — only once — and you get your m_T. So this is a very efficient algorithm. But in the nonlinear case, since the potential is no longer simply f but δF/δm, which depends on m, you need to compute m_t for each small t, which means you need to compute the weights at each time step in order to update. This adds quite a lot of complexity to the algorithm, but it still works, just a little more slowly. The point, though, is that in some particular cases — for example the landscape here, which is why I mentioned this example: the landscape is irregular, and it does not even need to be that bad, the gradient just needs to be very big somewhere — a gradient-descent algorithm will fail because it ignores this kind of jump, whereas this diffusion-driven dynamics can avoid that kind of bias. That is the advantage. Also, if we go back to its origin in quantum chemistry, it is more natural to study this kind of problem.

I got another question as a private message in the chat, so let me read it to you. The question is: Fisher information is also used in another optimization method, using the natural gradient; is there any connection between using the natural gradient and your method? Let me think. Well, first, I do not know exactly what "natural gradient" means here, and the questioner cannot explain the question further. I am sure the Fisher information has other applications; it is a common notion in information theory. Are there further questions?

Thanks — it relates to this, basically. In terms of information geometry, what kind of extensions could there be? Is there any special logic why the Fisher information works here, or could this be extended to other information measures? And in terms of divergences — the Fisher divergence — could this be extended, say, to Kullback–Leibler and similar divergence measures, even Bregman divergences, that are used in such optimization problems?

OK, that is a good question; let me think a little. Were you asking what the general theory is behind all those choices? Well, the short answer is that it is hard to say, because — let us go back to the beginning — in previous work we studied the regularizer based on the relative entropy, the KL divergence, and this leads to the Langevin, or more generally the mean-field Langevin, dynamics; and just replacing the relative entropy by the Fisher information leads us to a completely different dynamics. So I do not see how you could propose a general theorem saying that you can use different information geometries — how you could combine all those geometries into one general theorem.

Maybe just: is there any special logic — could this work in general for information measures as regularization here, or is there a special logic why the Fisher information is key for this?

Well, I can only compare the relative entropy and the Fisher information. Let us go back to the numerical example, where you can see it more intuitively.
As I said, the relative-entropy regularization corresponds to the Langevin dynamics, which is a first-order algorithm: it is a diffusion whose drift is the gradient of a potential — in other words, a first-order algorithm. In our case, when the problem is regularized by the Fisher information, the dynamics — the training dynamics — is in fact a zeroth-order algorithm: you only need to evaluate the potential itself, you do not need access to its gradient. So by changing the information geometry, your algorithm changes dramatically, from a first-order to a zeroth-order algorithm. I do not know if that is clear enough.

Yeah, thanks, thank you. Are there further questions? All right, that is not the case, so thank you again for a very nice talk. Thank you. As usual, we will have the next seminar in two weeks from now. Thank you.
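For reference, the first-order versus zeroth-order contrast made in this last answer can be summarized, in the hedged notation used above (in the linear case δF/δm = f), as:

\[
\text{relative entropy}\;\Rightarrow\;\text{(mean-field) Langevin:}\quad
\mathrm{d}X_t=-\nabla\frac{\delta F}{\delta m}(m_t,X_t)\,\mathrm{d}t+\sqrt{2\sigma}\,\mathrm{d}W_t
\quad(\text{needs }\nabla f\text{: first order}),
\]
\[
\text{Fisher information}\;\Rightarrow\;\text{mean-field Schrödinger:}\quad
\partial_t m_t=-\frac{\delta F^{\sigma}}{\delta m}(m_t,\cdot)\,m_t ,
\]
simulated with Brownian particles killed at a rate built from f itself (needs only evaluations of f: zeroth order).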

Image YouTube

Déroulement de la vidéo:

0.359 are you ready now yeah for sure all
0.359 right so it&;s a pleasure to announce her
0.359 a second speaker today
0.359 um Sanji Iran from University Paris the
0.359 theme we&;ll talk about
0.359 Greenfield optimization regularized by
0.359 officials information please
0.359 yeah thank you thank you thanks David
0.359 and the Aja for the warming introduction
0.359 and the invitation of us so today uh
0.359 it&;s my pleasure to talk in this
0.359 probability seminar
0.359 um my our recent research on the mean
0.359 field Optimum Edition probably regulated
0.359 by the so-called visual information but
0.359 this is a joint work with the Giovanni
0.359 Julia and the in Paris
0.359 well it&;s on on the going work but it
0.359 should appear on archive but uh by next
0.359 month I think well let&;s see so
0.359 so first let&;s talk about the motivation
0.359 so for us the what&;s the initial
0.359 motivation at least it comes from the
0.359 the uh the deeper deep learning uh
0.359 so uh no nowaday everybody know what is
0.359 a deeper neural networks
0.359 so simply speaking it&;s just a a
0.359 particular parametrization for uh for
0.359 desired function so given a Target
0.359 function f which is continuous compact
0.359 set let&;s say and the so-called neural
0.359 network is the parameterization in this
0.359 form it&;s a composition of activation
0.359 function five okay on each layer
0.359 activation function is in this form a
0.359 linear combination of
0.359 a nonlinear function of another linear
0.359 accommodation well everybody knows that
0.359 so the universal representation CRM
0.359 claims if this activation function is
0.359 assumed to be non-constant bounded and
0.359 continuous then uh any continuous
0.359 function on uh Compass support can be
0.359 approximated in this way
0.359 so this ensures the expressiveness of a
0.359 neural network however why this kind of
0.359 parametricization can be trained can be
0.359 approximated it&;s still a mystery in
0.359 mathematics because as we will see later
0.359 soon in fact we are facing an over
0.359 parametrized non-convex optimization
0.359 which in general should not be so easy
0.359 to to serve mathematically
0.359 so uh well throughout the talk uh we
0.359 will only talk about the so-called two
0.359 layer Network or you can quite a
0.359 one-hidden network so one Hidden Lake
0.359 public means there&;s only one activation
0.359 file there&;s no composition okay
0.359 all right so to to find those optimal
0.359 ways of the neural Nets c k a k and the
0.359 BK
0.359 you need to solve this kind of for
0.359 optimization problem
0.359 um well as you can see it&;s because this
0.359 function Phi is non-linear in particular
0.359 you can choose the radioactive
0.359 activation function or sigmoid
0.359 um very irregular function
0.359 so uh this is known convex optimization
0.359 and with a lot of parameters so it&;s
0.359 difficult so
0.359 so uh well in previous work and uh in
0.359 the literature
0.359 um in fact the trick is to uh well you
0.359 can add this normalization and then you
0.359 treat this empirical sum as an
0.359 approximation of a of a new mathematical
0.359 expectation of random variables c a b
0.359 capital c a b so now the once you write
0.359 the optimization in the probability
0.359 measure space this problem become linear
0.359 uh become Commerce sorry complex so this
0.359 objective function capital F is a
0.359 function defined on the probability
0.359 measure space and its complex it&;s just
0.359 a quadratic
0.359 okay
0.359 um so uh now we are facing this convex
0.359 mingfield optimizing problem
0.359 um well inspired by the neural network
0.359 problem uh so uh in order to solve this
0.359 kind of the kind of complex media
0.359 optimization problem uh we need to oh
0.359 well first let&;s review some literature
0.359 um in the previous work with the
0.359 Titan David and the Lucas
0.359 so we add a regularizer to this new
0.359 optimizing problem which is a relative
0.359 entropy denoted by H inside
0.359 Sigma is a temperature usually taken to
0.359 be small number
0.359 okay and we relate this legalized
0.359 well regularized by entropy regularized
0.359 me fuel optimization problem to the
0.359 so-called nuclear enlargement diffusion
0.359 written here so
0.359 um
0.359 here the DMF is not introduced but we
0.359 will see it later it&;s an intrinsic
0.359 directive of a function um defined on
0.359 the probability measure space and uh and
0.359 we prove that
0.359 the marginal law of this meteorological
0.359 diffusion this maximum level of
0.359 diffusion converts to the unique
0.359 minimizer of the legalized
0.359 me field optimization problem
0.359 okay so in this talk
0.359 in this talk we will change the angle of
0.359 our research so instead of adding the
0.359 relative entropy we use the official
0.359 information as the regularizer so facial
0.359 information written in this red term
0.359 so we will see by changing this
0.359 regulator it will also change of course
0.359 the dynamic which approximates the uh
0.359 the minimizer
0.359 and we will see how it will
0.359 automatically change the whole story
0.359 okay so for people who are in other
0.359 community like closer to content
0.359 mechanics so the feature information is
0.359 nothing but the kinetic energy in the
0.359 literature of content mechanics so if
0.359 you look at the square root of the
0.359 probability namely the wave functions
0.359 and the gradient of the wave function
0.359 Square integral is the so-called kinetic
0.359 energy
0.359 okay so um well here is the side Mark in
0.359 fact the special information uh s
0.359 relative entropy they are strictly
0.359 convex functional on the probability
0.359 meter space and if we assume as we
0.359 introduced in the example of machine
0.359 learning f is the comics itself then as
0.359 some as a sum as F Sigma the so-called
0.359 free energy function is strictly complex
0.359 functional on the probability measure
0.359 space
0.359 all right so that&;s the that&;s the uh
0.359 regulated uh regulates the regularized
0.359 the mutual optimization problem we
0.359 should study in this talk
0.359 okay so
0.359 um
0.359 the first step to understand that this
0.359 kind of minimization is to
0.359 characterize the minimizer of this kind
0.359 of function of probability measures so
0.359 let F be such a function and the first
0.359 lesson it defines the so-called linear
0.359 derivative
0.359 so denoted by Delta F over Delta m is
0.359 defined in this fashion so basically we
0.359 can see that we Define the function
0.359 which satisfied the Taylor expansion on
0.359 the probability measure space to be its
0.359 derivative linear derivative
0.359 okay we will see examples so here is a
0.359 abstract definition we can see example
0.359 it&;s more clear so the the first example
0.359 if f is linear it&;s just a linear
0.359 expectation of a given integral file
0.359 then the linear derivative is just the
0.359 integrand function itself Phi of x
0.359 since it&;s linear so in fact you see
0.359 it&;s a linear derivative does not depend
0.359 on M okay it&;s just a function of x
0.359 and in a more General example let&;s let
0.359 F be the
0.359 nonlinear function G of linear
0.359 expectation of five then we can apply
0.359 chain rule to calculate the linear
0.359 derivative of this kind of f it&;s just
0.359 taking derivative of G
0.359 and then times the linear derivative of
0.359 the linear expectation which is fine
0.359 according to the example a
0.359 all right so uh that&;s the definition of
0.359 linear derivative and why it&;s important
0.359 for analyzing the minimization of
0.359 of convex functionals that&;s because of
0.359 this simple
0.359 observation so let F be convex and then
0.359 uh it&;s certified this simple inequality
0.359 we move this red one to the right hand
0.359 side and then develop the right hand
0.359 side using the definition of linear
0.359 derivative you can replace it and now uh
0.359 let&;s take take the limit if some go to
0.359 zero
0.359 it gave you this inequality and
0.359 clearly a sufficient term for M to be
0.359 the minimizer is the the blue term that
0.359 I have over there the mrm is equal to
0.359 constant
0.359 okay because once it&;s equal to constant
0.359 since M Prime and M are both probability
0.359 measures the right hand side is canceled
0.359 is zero and it&;s zero for every M Prime
0.359 which means F ocean is m is exactly the
0.359 minimizer of the function of s
0.359 um okay so this gives the sufficient
0.359 information okay by introducing linear
0.359 directive we can characterize the
0.359 minimizer
0.359 in this way
0.359 so let&;s come back to our free energy
0.359 function f Sigma well we record here
0.359 it&;s just there the objective function f
0.359 plus uh small temperature times fissure
0.359 information
0.359 okay first in order to feel that
0.359 efficient information is a legal
0.359 recognizer we have a little serum saying
0.359 if Sigma go to zero to send a minimum of
0.359 f Sigma in fact converge to a minimum of
0.359 for f itself so the regularizer it does
0.359 not cause bias for temperature going to
0.359 zero
0.359 uh the next thing is well we we try I
0.359 already mentioned this so F Sigma is
0.359 strictly comex so if there is a
0.359 minimizer it is unique in certain at
0.359 least in a certain uh subset
0.359 all right uh so uh
0.359 buy some uh secular self violation we
0.359 can uh we can get the first condition
0.359 for this problem
0.359 so
0.359 well let&;s forget about all the details
0.359 so basically it&;s the same thing as the
0.359 as a as a as what is uh written in the
0.359 previous slides
0.359 so here with a regularizer instead of
0.359 the simple inequality we have seen
0.359 before we have this actual blue time and
0.359 this actual gluten effect comes from the
0.359 calculus of variation on the infection
0.359 all right and if you you assume M to be
0.359 regular enough you can simply write the
0.359 blue term in this red term okay simplify
0.359 a little bit
0.359 so uh this also give a sufficient
0.359 condition if the right hand side the
0.359 integral of the right hand side is a
0.359 constant then based on the same logic as
0.359 I explained before it gives a sufficient
0.359 condition for M to be the minimizer and
0.359 since F Sigma is strictly convex then it
0.359 must be the unique minimize
0.359 okay here we introduce a useful notation
0.359 which we will come come back to uh in
0.359 the later slides so um
0.359 as we said this term comes from a
0.359 variation
0.359 vibration so uh basically you can you
0.359 can you can take it as a derivative of
0.359 the free energy function okay FC so we
0.359 just Define this notation F Sigma Delta
0.359 F Sigma over Delta m is equal to this
0.359 term
0.359 okay uh why I say this definition that&;s
0.359 because uh strictly speaking this free
0.359 energy function of Sigma is not a
0.359 differentiable by our previous
0.359 definition
0.359 okay so um
0.359 let&;s look at this uh first other first
0.359 of the equation okay
0.359 if we look at this first the equation in
0.359 fact we can relate to this minimization
0.359 problem to some other interesting
0.359 problems
0.359 okay let&;s take uh this is coming from
0.359 some change of variables so first let&;s
0.359 take the variable uh change our variable
0.359 to psi equal to square of n so I I
0.359 mentioned it before m is the probability
0.359 measures and the square
0.359 root of probability measure in condiment
0.359 hanging mechanics is a wave function and
0.359 if we look at what is the wave function
0.359 certify here
0.359 it&;s in fact the shooting equation the
0.359 mean field shading equation
0.359 the minimizer if M Star the minimizer
0.359 according to the first of the equation
0.359 and five stars the square root of M Star
0.359 I feel certified is uh
0.359 eigenvalue problem of Midfield trading
0.359 operators
0.359 and well in the simple case if f is
0.359 linear
0.359 it&;s our example a if you remember
0.359 then this goes back to the classical
0.359 content mechanics the C is the the
0.359 smallest again value of the shading
0.359 operator and the beside the so-called
0.359 Grand State and and in general if f is
0.359 not linear then this is uh uh this is a
0.359 problem to find ground state for newly
0.359 non-linear trading equation which is a
0.359 well a practical problem in content
0.359 chemistry for example
0.359 so in fact we can treat this as our
0.359 second motivation instead of
0.359 instead of the neural networks
0.359 so another interesting change of
0.359 variable comes here so if we Define u to
0.359 be minus log of n
0.359 okay and in fact we can we observe that
0.359 you certified this erotic
0.359 hey Gothic hdb equation Hamilton
0.359 jacobioma equation
0.359 and if people are familiar with
0.359 literature of a review game then in fact
0.359 this HGB equation categorizes us
0.359 characterize the the value function well
0.359 the character that&;s the optimal control
0.359 of the potential erotic significant game
0.359 okay so by these two change of variable
0.359 you can relate to this new field
0.359 optimization problem to this two
0.359 interesting problems so that&;s a side
0.359 effect
Now let's introduce the dynamics which approximates the minimizer of our mean-field optimization problem regularized by the Fisher information. Again, look at the first-order equation: if m* is the optimizer, then ψ*, defined as the square root of m*, satisfies a Schrödinger eigenvalue problem. Based on this static problem we introduce a dynamic version, which we call the mean-field Schrödinger dynamics: basically, we add a ∂_t ψ term here. A little remark: I did not mention before that the linear derivative is in fact only defined up to a constant; by defining the equation in this way it is no longer up to a constant, but that is not so important for this talk. We are going to show that the solution ψ_t converges to the optimizer ψ*; that is one of our objectives.
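Schematically — with the same caveat that the precise constants and the normalization term follow the slides rather than this transcript — "adding a ∂_t ψ" to the eigenvalue problem gives an evolution of the form

\[
\partial_t \psi_t(x)=\frac{\sigma^{2}}{2}\,\Delta\psi_t(x)-\frac{\delta F}{\delta m}(m_t,x)\,\psi_t(x)+\lambda_t\,\psi_t(x),
\qquad m_t=\psi_t^{2},
\]

where λ_t plays the role of the eigenvalue and is fixed so that m_t remains a probability measure — this is the sense in which the constant is no longer free.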
Now we come back to this change of variable: we define m_t = ψ_t², and in fact the flow m_t satisfies this so-called Fokker–Planck equation. For people familiar with stochastic processes, you can treat m_t as the marginal law of a mean-field branching diffusion whose death rate is the term in the bracket. We will come back to this point of view later.
By some simple calculus we can rewrite this equation in this way, and recall that this red term is the notation we introduced on the slide about the first-order condition: the so-called linear derivative of the free energy function F^σ. So the Fokker–Planck equation can be written simply as ∂_t m = −(δF^σ/δm) m. Keep this in mind.
On the other hand, if we recall the other change of variable, u = −log m, then the dynamic version of u also satisfies an HJB equation — no longer the ergodic HJB equation but a parabolic HJB equation. So you can see that these three equations — the Schrödinger equation, the Fokker–Planck equation and the HJB equation — are in one-to-one correspondence with each other, like a trinity, and we will exploit this.
The first thing, before studying the limit, is the well-posedness of this mean-field equation. Since the three equations are one-to-one with each other, we only need to pick one — a favourite one — to study; let's pick the mean-field HJB equation and prove its well-posedness. To prove the well-posedness of this kind of mean-field equation, you only need to prove that a certain fixed-point map is a contraction. Given a flow of measures m̄, you plug it into the HJB equation, which then becomes a classical HJB equation; you define U as the solution of this classical HJB equation, and then you define a new flow of probability measures m using this potential U. This gives you the fixed-point mapping from m̄ to m. Once you prove that this mapping is a contraction, and hence admits a unique fixed point, you have proved that the mean-field HJB equation is well posed. In particular, you only need to prove the contraction on a short horizon; then you can paste the solutions together to reach longer horizons.
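As a purely structural illustration (not the actual PDE machinery: `solve_hjb` and `flow_from_potential` below are hypothetical placeholders standing in for the two steps just described), the fixed-point argument has the shape of a Picard iteration on a short horizon:

```python
from typing import Any, Callable

def picard_iteration(m_bar0: Any,
                     solve_hjb: Callable,
                     flow_from_potential: Callable,
                     distance: Callable,
                     tol: float = 1e-8,
                     max_iter: int = 100) -> Any:
    """Abstract fixed-point iteration behind the well-posedness argument.

    Each sweep: (1) freeze the measure flow and solve the resulting classical
    HJB equation for the potential U; (2) rebuild the measure flow from U.
    On a short horizon the map is a contraction, so the iterates converge.
    """
    m_bar = m_bar0
    for _ in range(max_iter):
        u = solve_hjb(m_bar)                # step 1: classical HJB with the frozen input flow
        m_new = flow_from_potential(u)      # step 2: measure flow generated by the potential U
        if distance(m_new, m_bar) < tol:    # Cauchy criterion coming from the contraction
            return m_new
        m_bar = m_new
    return m_bar
```

Of course, in the proof one does not iterate numerically; one shows the map is Lipschitz with constant smaller than one in a Wasserstein-type distance when the horizon T is small enough, and then pastes short-horizon solutions.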
The strategy to show the contraction is classical: estimates. First, a stability estimate: it is not difficult to prove that the difference between the gradients of the solutions U can be dominated by a constant times the Wasserstein distance between the two given input flows m̄ and m̄′. That is the stability analysis. The second ingredient is more original: using a reflection-coupling argument, we can prove that, for m and the corresponding m′, the Wasserstein-1 distance between m_t and m′_t can be dominated by the L∞ norm of the gradient of the difference of the potentials, where U is the potential of m and U′ the potential of m′. Once you have these two estimates, combining them gives that the Wasserstein-1 distance between the output flows m and m′ is dominated by a constant times the Wasserstein-1 distance between the input flows m̄ and m̄′; and since the constant depends on the horizon — in particular, it goes to zero as the horizon T goes to zero — when T is small enough you get a contraction in this sense.
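Schematically, the two estimates compose as follows (the constants are written generically, and the horizon-dependence is placed on the coupling estimate purely for illustration):

\[
\|\nabla U-\nabla U'\|_{\infty}\;\le\;C\,\sup_{t\le T}\mathcal W_1(\bar m_t,\bar m'_t),
\qquad
\sup_{t\le T}\mathcal W_1(m_t,m'_t)\;\le\;\varepsilon(T)\,\|\nabla U-\nabla U'\|_{\infty},
\]

with ε(T) → 0 as T → 0, so that

\[
\sup_{t\le T}\mathcal W_1(m_t,m'_t)\;\le\;C\,\varepsilon(T)\,\sup_{t\le T}\mathcal W_1(\bar m_t,\bar m'_t),
\]

which is a contraction once T is small enough that Cε(T) < 1.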
So much for the well-posedness. In addition, we in fact have a more important result, about regularity. Again, look at this mean-field HJB equation: we have already seen the fixed-point problem, and now we look into the regularity of the equation. A very important assumption we make in this paper is that the linear derivative δF/δm can be decomposed into two parts, a capital G and a small g. The small g is a function of x only; the capital G carries the mean-field dependence, and we assume that this part with mean-field dependence is uniformly Lipschitz in x. The part without measure dependence, the small g — you can treat it as a physical confinement potential — is strictly convex: its Hessian has a lower bound, with the small c here strictly positive, and the Hessian also has an upper bound. Using probabilistic arguments — in particular, techniques from forward–backward SDEs and reflection coupling — we can prove that the solution of this HJB equation preserves this kind of decomposition: it can be decomposed into two parts, one Lipschitz like the capital G and one strictly convex like the small g. Moreover, the coefficients — the Lipschitz constant, the convexity lower bound and the Hessian upper bound — do not depend on time, so we have uniform-in-time estimates on these regularities. This will play a crucial role in the later analysis. I will not go into the details; I just state the results.
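In symbols, with the specific constants as placeholders, the assumption reads

\[
\frac{\delta F}{\delta m}(m,x)=G(m,x)+g(x),\qquad
|\nabla_x G(m,x)-\nabla_x G(m,y)|\le L\,|x-y|,\qquad
c\,I\preceq \nabla^{2} g(x)\preceq C\,I,\quad c>0,
\]

and the conclusion is that the solution of the mean-field HJB equation decomposes accordingly, U_t = W_t + V_t, with W_t Lipschitz and V_t strictly convex with a bounded Hessian, the bounds being uniform in t.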
In order to prove the convergence of this so-called mean-field Schrödinger dynamics towards the minimizer of the mean-field optimization problem regularized by the Fisher information, one key observation is the following energy dissipation. F^σ is the free energy function — F plus the Fisher-information regularizer. Once we plug in m_t, the marginal law of the mean-field Schrödinger dynamics, we can compute the time derivative of the free energy; in fact, it can be written explicitly in this way. Why? Formally it is very easy to see: the time derivative of the free energy is equal to this because of the definition of the linear derivative — the linear derivative means that the difference between the functional values can be written as the integral of the linear derivative against the difference between the measures, and over a small time the difference between the measures is just ∂_t m_t. Now you expand ∂_t m_t using the Fokker–Planck equation we introduced before, with our notation δF^σ/δm; recall that the Fokker–Planck equation can be written simply in this way, and this already gives you the result. That is why I say it is very easy to obtain formally; to prove it rigorously, we rely on exactly the regularity results shown on the previous slides.
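Putting the two displayed facts together, the formal computation reads

\[
\frac{d}{dt}F^{\sigma}(m_t)
=\int \frac{\delta F^{\sigma}}{\delta m}(m_t,x)\,\partial_t m_t(dx)
=-\int \Big|\frac{\delta F^{\sigma}}{\delta m}(m_t,x)\Big|^{2}\,m_t(dx)\;\le\;0,
\]

using the Fokker–Planck form ∂_t m_t = −(δF^σ/δm)(m_t,·) m_t.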
Intuitively, this already gives the convergence result. Why? Recall that the first-order condition for m* to be the minimizer is that this linear derivative of the free energy function equals zero; this is a sufficient condition. If you look at this energy dynamics, it says that along the mean-field Schrödinger dynamics the energy always decreases at this rate, so it keeps decreasing until the rate touches zero; and once the rate touches zero, you invoke the first-order condition and you have reached the minimizer. That is the spirit, the intuition behind the proof. I do not show the real proof here — it is a little technical — but eventually we can follow this intuition and prove that the marginal law m_t of the Schrödinger dynamics really converges to the minimizer m*, in the sense of the Wasserstein-1 distance.
In fact, we can do much better: we can prove exponential convergence. Using the convexity of F — through a Poincaré-type inequality — and the regularity estimates I showed before, we can prove the following kind of inequality: the rate of decrease of the energy dominates a constant times the difference between the current energy and the minimal energy (we have already proved that m_∞ = m*, if you remember). This gives the exponential convergence by Grönwall's inequality, as you can see, and the convergence rate is C, where this constant C depends on the coefficient of the Poincaré inequality and on the estimates we have for the regularities. So, up to now, we have proved that the mean-field Schrödinger dynamics converges to the minimizer of the mean-field optimization problem regularized by the Fisher information.
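Written out, the inequality and the Grönwall step are

\[
\frac{d}{dt}\Big(F^{\sigma}(m_t)-F^{\sigma}(m^*)\Big)\;\le\;-\,C\,\Big(F^{\sigma}(m_t)-F^{\sigma}(m^*)\Big)
\quad\Longrightarrow\quad
F^{\sigma}(m_t)-F^{\sigma}(m^*)\;\le\;e^{-Ct}\,\Big(F^{\sigma}(m_0)-F^{\sigma}(m^*)\Big).
\]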
So this offers an approximation of the minimizer; but in fact one can construct infinitely many stochastic processes approximating this kind of target measure. In what sense is the Schrödinger dynamics optimal? That question draws us to the study of gradient flows.
Let's recall what a gradient flow is, and first recall the basic case in Euclidean space. In Euclidean space we know that the gradient flow is expressed by this ODE: f is the potential, the landscape, and the gradient flow looks this way. We can in fact write it in another way: first discretize the continuous-time dynamics with the implicit Euler scheme; then the implicit Euler step can be read as a variational problem. Given x at step i, you solve a variational problem to get the value of x at step i+1, with time step h. This is the so-called variational formulation of the gradient flow.
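Concretely, with f the potential and h the time step, the implicit Euler step is the variational problem

\[
x_{i+1}\;\in\;\arg\min_{y}\;\Big\{\,f(y)+\frac{|y-x_i|^{2}}{2h}\,\Big\},
\]

whose optimality condition (x_{i+1} − x_i)/h = −∇f(x_{i+1}) recovers the gradient-flow ODE ẋ = −∇f(x) as h → 0.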
Let us also recall the seminal work of JKO on the gradient flow on the space of probability measures — on the Wasserstein-2 space. Given a function F of probability measures, we can define a similar variational problem to determine a gradient flow; let me be more precise: here, instead of the Euclidean distance, you use the Wasserstein-2 distance. What makes the story more interesting is when you add the entropy as a regularizer to this objective function, this potential function F. Then the JKO paper tells us that this discrete-time flow converges to a continuous-time flow which solves this Fokker–Planck equation, and, as we all know, this Fokker–Planck equation describes the marginal law of the mean-field Langevin dynamics. So the mean-field Langevin dynamics is the gradient flow of the functional regularized by the relative entropy, in the Wasserstein-2 space.
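For reference, the JKO scheme and its limit take the following shape (the placement of σ is my own normalization, not necessarily the one on the slides):

\[
m_{i+1}\;\in\;\arg\min_{\mu}\;\Big\{\,F(\mu)+\frac{\sigma^{2}}{2}\int \mu\log\mu +\frac{\mathcal W_2^{2}(\mu,m_i)}{2h}\,\Big\}
\;\xrightarrow[h\to 0]{}\;
\partial_t m_t=\nabla\!\cdot\!\Big(m_t\,\nabla\frac{\delta F}{\delta m}(m_t,\cdot)\Big)+\frac{\sigma^{2}}{2}\,\Delta m_t,
\]

which is exactly the Fokker–Planck equation of the mean-field Langevin diffusion dX_t = −∇(δF/δm)(m_t, X_t) dt + σ dW_t.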
Here, in fact, we have a similar story for the mean-field Schrödinger dynamics. What we should do is make two replacements. First, we replace the metric of the space: here we had the Wasserstein-2 distance; we replace the squared Wasserstein-2 distance by the relative entropy. Then we also replace the regularizer: the previous regularizer, the relative entropy, is replaced by the new regularizer, the Fisher information. After these two replacements we define — in fact, we try to define — the gradient flow of this regularized potential in the space metrized by the relative entropy, and what we get is in fact our mean-field Schrödinger dynamics.
Here I show it formally. Recall the first-order equation: it says that if m_{i+1} solves this discrete-time mean-field optimization problem, then it satisfies this first-order condition. We can prove that when the time step h goes to zero, this discrete-time flow converges to a continuous-time flow m, and, formally, you can see that the limit should read in this way — and this is nothing but the mean-field Schrödinger dynamics, because ∂_t log m_t is equal to ∂_t m_t / m_t. That is how we get it. This is a formal argument, but in the paper we prove it in a rigorous manner.
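A sketch of the formal computation, with h the time step, my own placement of the step size, and the additive constant coming from the normalization suppressed:

\[
m_{i+1}\;\in\;\arg\min_{\mu}\;\Big\{\,F^{\sigma}(\mu)+\frac{1}{h}\,H(\mu\,|\,m_i)\,\Big\}
\quad\Longrightarrow\quad
\frac{\delta F^{\sigma}}{\delta m}(m_{i+1},\cdot)+\frac{\log m_{i+1}-\log m_i}{h}=\text{const},
\]

and letting h → 0 gives ∂_t log m_t = −(δF^σ/δm)(m_t,·) up to a constant, i.e. ∂_t m_t = −(δF^σ/δm)(m_t,·) m_t, the Fokker–Planck form of the mean-field Schrödinger dynamics written earlier.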
So we have shown that the mean-field Schrödinger dynamics converges to the minimizer when the problem is regularized by the Fisher information, and we have also shown that this dynamics is optimal in the sense of being a gradient flow with respect to the relative-entropy metric. Another question, which is important for completing the story, is: why is this useful? In other words, how can we simulate this kind of mean-field Schrödinger dynamics so as to approximate the minimizer? To keep the story simple, let's introduce the numerical scheme in the linear case, so F is linear here — the linear expectation of a small f. Then the Fokker–Planck equation can be written in this way; as we said, it describes the marginal law of a branching diffusion with the death rate equal to this term, and c_t is just the normalization constant, so we do not need to care about it for now.
At first sight this may not look so useful: there is a natural diffusion behind this equation, but we do not know the density function m analytically, so we cannot evaluate this death rate numerically, and this certainly causes a problem for numerical simulation. So how do we make a detour around this difficulty?
How to make this detour? Let's again turn to the change of variables. The square root ψ = √m satisfies this Schrödinger equation — you can call it an imaginary-time Schrödinger equation — where this c_t (the red c_t is the orange c_t from before) is the normalization constant which makes m a probability measure. In fact, you can introduce two rescalings of this equation. The first rescaling simply removes the normalization constant c_t. The second changes the normalization constant into the one which makes ψ̃ itself a probability measure: recall that c_t is the constant which makes m a probability measure, but now I change it to c̃_t, which makes ψ̃ itself a probability measure. Why these rescalings? Because the difference between ψ and ψ̄, or between ψ and ψ̃, is just multiplication by a deterministic, time-dependent constant.
Why did I introduce these two rescalings? Because they are both numerically friendly. For the first one, based on the Feynman–Kac representation we can write a probabilistic representation of ψ̄, so it can be simulated numerically in a simple way. For the other one, ψ̃ is just the marginal law of a branching diffusion, here with death rate f. The previous death rate contained the gradient of log m — that was the thing we could not evaluate — and in this formulation it has disappeared.
So ψ̃ becomes numerically accessible. Eventually we are interested in m_t, and recall that m_t is the square of ψ, while ψ differs from ψ̄ and from ψ̃ only by a deterministic constant; that is why you can rewrite ψ² in this way. Since ψ̄ can be simulated using this Feynman–Kac representation, and ψ̃ can be sampled as the marginal law of a branching diffusion with a simple death rate — here the x^i are samples of this branching diffusion — in the end we can represent m_t numerically as a weighted empirical measure, like this, and everything here is numerically accessible. This gives you a way to simulate the so-called mean-field Schrödinger dynamics, which approximates the minimizer.
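To give a feel for the "samples from a diffusion, weights from a Feynman–Kac functional" idea, here is a minimal sketch for the linear case F(m) = ∫ f dm. It is not the exact two-rescaling scheme of the talk: it uses independent weighted particles rather than an actual branching mechanism, the weighting is the plain Feynman–Kac exponential, and all names and constants are illustrative.

```python
import numpy as np

def weighted_particle_measure(f, T=1.0, n_steps=200, n_particles=10_000,
                              sigma=1.0, x0=0.0, rng=None):
    """Crude Feynman-Kac particle sketch for the linear case F(m) = E_m[f].

    Independent Brownian particles X^i are simulated with an Euler scheme,
    each carrying the weight w^i = exp(-int_0^T f(X^i_s) ds); the returned
    pair (positions, normalized weights) is a weighted empirical measure.
    """
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    x = np.full(n_particles, x0, dtype=float)
    log_w = np.zeros(n_particles)
    for _ in range(n_steps):
        log_w -= f(x) * dt                                           # accumulate the Feynman-Kac weight
        x += sigma * np.sqrt(dt) * rng.standard_normal(n_particles)  # Euler step of the diffusion
    w = np.exp(log_w - log_w.max())                                  # stabilize before normalizing
    return x, w / w.sum()
```

In an actual branching implementation the exponential weights are traded for killing and duplication of particles, which keeps the population unweighted; the weighted version above is simply the shortest thing that runs.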
Let's do a quick numerical test. We are in the linear case, and we take the small f, the potential, in this form; basically, the mean-field Schrödinger dynamics should give us the minimizer, which should concentrate in the region shown on the slide. And that is what it does: after our numerical simulation, the samples concentrate in this area. This is the training error — the log training error, since it is plotted on a logarithmic scale — and we can see that we indeed have exponential convergence. Why did I choose this particular f, this landscape, this potential function? Because for this kind of potential the Langevin dynamics can never work: the Langevin dynamics only sees the gradient of the potential, and here the gradient in the interesting area is always zero. So the Langevin dynamics is blind between, say, -5 and 5: it does not see the change of the landscape, but the Schrödinger dynamics can see it. Langevin would have failed in this case, but the Schrödinger dynamics still works.
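To illustrate why a gradient-driven sampler gets no signal here, consider a hypothetical potential in the spirit of the slide (the exact shape and numbers from the slide are not reproduced; these values are made up):

```python
import numpy as np

def f_plateau(x, well_center=2.0, well_width=0.25, well_depth=3.0):
    """Flat landscape with one narrow well: the gradient is zero almost everywhere."""
    return np.where(np.abs(x - well_center) < well_width, -well_depth, 0.0)

x = np.linspace(-5.0, 5.0, 2001)
grad = np.gradient(f_plateau(x), x)
print(np.count_nonzero(grad))  # non-zero only at the few grid points around the well edges
```

A Langevin particle started on the plateau feels zero drift and diffuses blindly, whereas a scheme that evaluates f itself (like the weighted-particle sketch above) immediately favours the well.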
So let me briefly conclude my talk. We studied the mean-field optimization problem, inspired by the study of neural networks — the training of neural networks, to be more precise — and in this paper we focused on the regularization by the Fisher information instead of the relative entropy. We introduced the mean-field Schrödinger dynamics, and we found that the free energy always decreases along these dynamics, and that the marginal law of the mean-field Schrödinger dynamics converges, in W1 or in L1, to the minimizer exponentially quickly. We also showed that this dynamics is special in the sense that it is a gradient flow if you metrize the space of probability measures using the relative entropy. Finally, we showed that this method is also numerically implementable by introducing this Monte Carlo method, and I think that even in the linear case this Monte Carlo method seems to be new to the literature, so it may have interesting applications even in quantum chemistry — but I am not an expert there, so if someone in the audience is exploring that, I would be very interested in further discussion. Thank you very much for your attention; that is all for my presentation.
Are there other questions? Maybe then I can start with a curiosity of my own. I think it is a very natural problem to look at this mean-field optimization problem, but you motivated it by the fitting of neural networks. So I am just wondering whether you can now somehow come back to that programme: did you learn something that you can use to better understand this calibration problem, this fitting problem, of neural networks?
Better — I cannot say that, and let me explain why, because this point is a little subtle. Here I introduced the algorithm for the linear case, and eventually you have to do this simulation. This simulation is in fact very efficient in the linear case — when the capital F is the linear expectation of a small f — because in that case you only need m_T at the terminal time. In the linear case you do not need to evaluate the weights of this weighted empirical law along the way; you only need to sample the marginal law of the branching diffusion with death rate f up to the terminal time T, and then at time T you compute the weights once — only once — and you get your m_T. So this is a very efficient algorithm. But in the nonlinear case, since the potential is no longer simply f but δF/δm, which depends on m, you need to compute m_t for each intermediate time t, which means you need to recompute the weights at each time step in order to update. This adds quite a lot of complexity to the algorithm; it still works, just somewhat more slowly.
That said, in some particular situations — for example the one here, which is why I mentioned this example — the landscape is irregular; it does not even need to be that bad, it just needs the gradient to be very big at some places, and then the gradient-descent algorithm will fail because it ignores this kind of jump. This kind of branching-diffusion-driven algorithm can avoid that kind of bias; that is the advantage. Also, if we go back to its original context in quantum chemistry, it is more natural to study this kind of problem there.
I got another question as a private message in the chat, so let me read it to you. The question is: the Fisher information is also used in another optimization method, the natural gradient; is there any connection between using the natural gradient and your method?

Let me see... let me think. Well, first, I do not know exactly what is meant here by the natural gradient, and the questioner cannot explain the question further. I am sure the Fisher information has other applications; it is a common notion in information theory.
Are there other further questions?

May I also ask an additional question? Thanks. It relates to this, basically, in terms of information geometry: what kind of extensions could there be — I mean, is there any special logic why the Fisher information works here, or could this be extended to other information measures? And also, in terms of the divergences — you used the Wasserstein distances of order one and two — could this be extended, say, to the Kullback–Leibler divergence and similar divergence measures, even Bregman divergences, which are used in this kind of probabilistic optimization problem?
Okay, that is a good question; let me think a little bit. Were you asking what the general theory behind all of these would be? It is hard to say — the short answer is that it is hard to say — because, let's go back to the beginning: in previous work we studied the regularization using the relative entropy, the KL divergence, and this leads to the Langevin dynamics, or more generally the mean-field Langevin dynamics; and just replacing the relative entropy by the Fisher information leads us to a completely different dynamics. So I do not see how one could propose a general theorem saying that you can use different information geometries and combine all those geometries into one general statement.

Maybe just this, then: is there any special logic — could this work for general information measures as regularization here, or is there a special logic why the Fisher information is key for this aspect?
Well, I can only compare the relative entropy and the Fisher information. Let's go back to the numerical example, where you can see it more intuitively. As I said, the relative entropy corresponds to the Langevin dynamics, which is a first-order algorithm: it is a diffusion with a drift which is the gradient of the potential, so in other words it is a first-order algorithm. In our case, when the problem is regularized by the Fisher information, the dynamics — the training dynamics — is in fact a zero-order algorithm: you only need to evaluate the potential itself, you do not need access to the gradient. So, by changing the information geometry, your dynamics changes dramatically, from a first-order to a zero-order algorithm. I do not know whether that is clear enough.

Yes, thanks, thank you.
Are there any further questions? All right, if that is not the case, then thank you again very much for this very nice talk.

Thank you.

Yes — as usual, we have the next seminar in two weeks from now.

Thank you.